Journal of Spatial Science
ISSN: 1449-8596 (Print) 1836-5655 (Online)
Journal homepage: http://www.tandfonline.com/loi/tjss20

To cite this article: Michelle Dye, Onisimo Mutanga & Riyad Ismail (2012) Combining spectral and textural remote sensing variables using random forests: predicting the age of Pinus patula forests in KwaZulu-Natal, South Africa, Journal of Spatial Science, 57:2, 193-211, DOI: 10.1080/14498596.2012.733620

To link to this article: http://dx.doi.org/10.1080/14498596.2012.733620

Published online: 03 Dec 2012.

Combining spectral and textural remote sensing variables using random forests: predicting the age of Pinus patula forests in KwaZulu-Natal, South Africa

Michelle Dye*, Onisimo Mutanga and Riyad Ismail

Discipline of Geography, School of Environmental Sciences, Faculty of Science and Agriculture, University of KwaZulu-Natal, Scottsville, South Africa

In this study we examined the utility of a statistical technique, random forests, to combine spectral and texture variables to accurately predict the age of Pinus patula stands. Using the QuickBird panchromatic band (0.6 m), five texture variables were calculated using 12 moving window sizes. The spectral variables used in this study consisted of the QuickBird visible and near infrared (NIR) bands (2.4 m). Using random forests, various methods of combining the spectral and texture variables were evaluated. The best model was based on random forests with a backward variable selection process which selected only five variables (NIR, green, variance with a 3 × 3 window, red and blue) of the original 64 variables and obtained the best predictive accuracies (R² = 0.68).

Keywords: texture; random forests; backward variable selection; forest age; Pinus patula

1. Introduction

Forest age is an important component of forest inventory because it is indicative of a number of forest conditions (Jensen et al. 1999; Jakubauskas & Price 2000; Franklin et al. 2003). Forest biophysical attributes, such as leaf area index (LAI), forest growth rate, canopy cover and net primary production, are often related to forest age, which can be used as a surrogate for these variables (Ahern et al. 1991; Danson & Curran 1993; Nieman 1995). Studies have shown that multispectral remote sensing imagery is a very useful tool for discriminating forest age (Table 1). A common approach is to examine the spectral reflectance of the stand (Jensen et al. 1999; Gerylo et al. 2002; Franklin et al. 2003; Johansen et al. 2007; Gebreslasie et al. 2008; van Aardt & Norris-Rogers 2008; Tomppo et al. 2009). Researchers have shown that the spectral response of a tree changes as it gets older due to changes in chlorophyll content and the internal structure of the plant (Jensen et al. 1999). In general, there is an inverse relationship between spectral reflectance and forest age because as trees grow older, competition for light, water and nutrients causes weaker trees to die off. This results in (i) a decrease in stems per hectare, and (ii) an increase in the size and visibility of shadows (Ahern et al. 1991; Danson & Curran 1993; Nieman 1995; Jensen et al. 1999; Jakubauskas & Price 2000; Gerylo et al. 2002; Franklin et al. 2003). Spatial techniques, such as texture analysis, provide an alternative to spectral analysis by using a combination of shape, size and spectral data to classify image data (Haralick et al. 1973). The increased textural information available in fine spatial resolution image data allows for improved interpretation based on the shape and texture of ground features.

*Corresponding author. Email: [email protected]

© 2012 Surveying and Spatial Sciences Institute and Mapping Sciences Institute, Australia

Table 1. Key studies using multispectral remote sensing imagery for mapping forest age.

| Reference | Sensor (resolution) | Age (yrs) | Species | Method | Input variables | Overall accuracy (%) | Number of classes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Jensen et al. (1999) | Landsat TM (30 m) | 4–40 | Loblolly pine (Pinus taeda) | Regression analysis using multiple regression and artificial neural networks | Spectral bands | 98.2% | |
| Jakubauskas and Price (2000) | Landsat 5 TM (30 m) | 0–250 | Lodgepole pine (Pinus contorta) | Regression analysis using stepwise multiple regression | Spectral bands | 90% | 6 |
| Gerylo et al. (2002) | Landsat 5 TM (30 m) | 21–198 | Jack pine (Pinus banksiana); Trembling aspen (Populus tremuloides) | Regression analysis using stepwise multiple regression | Spectral bands | 75% (Jack pine); 44% (Trembling aspen) | |
| Franklin et al. (2003) | Landsat 5 TM (30 m) | 21–198 | White spruce (Picea glauca); Jack pine (Pinus banksiana) | Classification using linear discriminant analysis (LDA) | Texture images and spectral bands | 92% (Jack pine); 63% (White spruce) | 2 |
| Kayitakire et al. (2006) | IKONOS-2 (4 m) | 27–110 | Norway spruce (Picea abies) | Regression analysis using simple linear model | Texture images | 81% | 4 |
| Johansen et al. (2007) | QuickBird (2.4 m) | Young, Mature, Old | Western hemlock (Tsuga heterophylla); Western redcedar (Thuja plicata); Amabilis fir (Abies amabilis); Yellow cedar (Chamaecyparis nootkatensis); Sitka spruce (Picea sitchensis) | Classification using object-oriented classification algorithm | Texture images and spectral bands | 78.9% | 7 |

(continued)

Texture analysis is useful for predicting forest age because structural changes in the stand cause variations in image texture and allow for relationships to be developed between age and canopy characteristics (Johansen et al. 2007). Studies have shown that image texture can successfully discriminate between forest age classes (Cohen et al. 1990; Franklin et al. 2001; Kayitakire et al. 2006), but more importantly, image texture has also been used in conjunction with spectral reflectance to provide additional discriminatory power. For example, Johansen et al. (2007) combined spectral reflectance and image texture derived from high spatial resolution imagery to discriminate between the different stages of riparian forest growth. The inclusion of image texture improved the classification accuracy of vegetation classes by between 2% and 19%. Similarly, Franklin et al. (2000) combined image spectral reflectance and texture to classify the composition of forest species in Alberta and New Brunswick. Results showed that the inclusion of image texture increased the overall classification accuracy by 5% in Alberta, and 12% in New Brunswick. Wunderle et al. (2007) used pan-sharpened SPOT 5 imagery to estimate forest stand structure. Image texture was included in the study to complement the spectral response data and increase model accuracy. A stepwise multivariate regression analysis was performed using both the spectral and texture variables and resulted in a high model accuracy (R² = 0.79).

However, remote sensing studies that have combined spectral and texture variables have tended to use more traditional statistical techniques. Frequently used methods include linear regression models (Jakubauskas & Price 2000), and linear discriminant analysis (Moskal & Franklin 2001; Franklin et al. 2003; Zhang et al. 2004). Linear models require normal distribution, linearity and the absence of collinearity amongst input variables (Jensen et al. 1999). However, ecological data are often very complex and nonlinear interactions may exist between the observed data and the remotely sensed data (De’ath & Fabricius 2000). For example, in the study by Jensen et al. (1999), a nonlinear relationship was shown to exist between the age of loblolly pine and near infrared (NIR) reflectance, with NIR reflectance decreasing at a faster rate as the tree matured. Therefore, more robust methods are required to handle remotely sensed data and their interactions with the response data (i.e. forest age).

Table 1. (Continued).

| Reference | Sensor (resolution) | Age (yrs) | Species | Method | Input variables | Overall accuracy (%) | Number of classes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Johansen et al. (2007) (continued) | | | Douglas fir (Pseudotsuga menziesii); Red alder (Alnus rubra) | | | | |
| Wunderle et al. (2007) | SPOT-5 (10 m) | 0–50 | Lodgepole pine (Pinus contorta); White spruce (Picea glauca); Balsam fir (Abies balsamea); Trembling aspen (Populus tremuloides) | Regression analysis using stepwise multiple regression | Texture images and spectral bands | 74% | 3 |

Regression trees (Breiman et al. 1984) have been recommended to overcome the limitations of linear-based models (Prasad et al. 2006) and have been widely used for prediction purposes in the remote sensing domain (Michaelson et al. 1994; DeFries et al. 1997; Hansen et al. 2002; Lobell et al. 2007). The method is popular amongst remote sensing researchers because (i) the input data can be continuous or discrete, (ii) the input data do not need to be normally distributed, and (iii) non-linear relationships between predictor variables and observed data can be modeled (De’ath & Fabricius 2000; Prasad et al. 2006; Bel et al. 2009). However, regression trees are sensitive to small variations in the training dataset (Breiman 2001). This can lead to instability with regard to variable selection, and can adversely affect the predictive performance of the final model (Elith et al. 2008). Consequently, ensemble methods such as random forests have been developed to reduce the instability of single regression trees and improve the overall predictive performance.

Random forests (Breiman 2001) are a combination of tree predictors where trees are grown without pruning, and the result is based on the average of all the trees. Trees are grown based on a bootstrap sample (sample with replacement) of the original training data, providing an internal estimate of the model’s predictive accuracy. The best split at each node of the tree is searched only among a randomly selected subset of predictors (Breiman 2001).

Random forests have several advantages over other statistical techniques. The ensemble can handle both categorical and continuous predictors and has the flexibility to perform several types of statistical analysis including regression, classification, survival analysis and unsupervised learning (Cutler et al. 2007). By growing a large number of trees, random forests do not overfit the data and the random predictor selection keeps bias low (Prasad et al. 2006). What makes random forests particularly powerful in comparison to other statistical methods is the ability of the ensemble to cope with complex interactions and highly correlated predictor variables. Furthermore, random forests can be used where there are more variables than observations (small n, large p problem) (Strobl & Zeileis 2008). Random forests also provide measures of variable importance to help with model interpretation. While other machine learning methods such as neural networks have been successfully used to accurately predict forest age (Jensen et al. 1999; Gebreslasie et al. 2008), these algorithms do not offer an internal measure of variable importance to provide insight regarding the variables that would best contribute to the final model (Archer & Kimes 2008).

The main objective of this paper is to determine whether random forests can be used to combine spectral and texture remote sensing variables to accurately predict the age of P. patula stands in KwaZulu-Natal, South Africa. The ability to map forest age using remote sensing techniques which incorporate spectral reflectance and image texture will be of benefit to forest managers as it provides an alternative to expensive and time-consuming field techniques. Remote sensing methods make it possible to assess large areas in a short space of time, and allow for complete coverage of an area.

Though past studies have successfully discriminated forest age using spectral approaches, these studies have tended to use lower resolution imagery. Jensen et al. (1999), Jakubauskas and Price (2000) and Gerylo et al. (2002) all used 30 m Landsat imagery. This is also evident in studies that combined spectral and texture variables. Franklin et al. (2003) used 30 m Landsat imagery, Wunderle et al. (2007) used SPOT-5 (10 m) and Johansen et al. (2007) used QuickBird 2.4 m imagery. With the increased availability of high spatial resolution imagery, image texture is becoming a useful source of information in forestry applications. This study used high spatial resolution imagery (2.4 m for spectral variables and 60 cm for texture variables) to test texture and spectral combination methods for forest age prediction. Various combinations of spectral and texture variables were evaluated based on window size, texture variable, principal component analysis and a backward variable selection procedure using ranked variable importance calculated by random forests.

2. Materials and methods

Study area

The study area (Figure 1) consists of 6391 ha of commercial forestry and forms part of the Hodgsons Sappi plantation, which is located near Greytown in KwaZulu-Natal, South Africa (centroid: latitude 29°13′42″S, longitude 30°29′56″E). The study area falls under the midlands mist belt grassland bioregion unit as defined by Mucina and Rutherford (2006). The area experiences summer rainfall and has a mean annual precipitation of 915 mm (range 730–1280 mm). Some of the winter, spring and early summer precipitation is in the form of cold front activity. Frequent and heavy mist provides significant amounts of additional moisture. The mean annual temperature is 15.8°C. The dominant soils in the study area are apedal and plinthic soil forms derived mostly from the Ecca group. The landscape is classified as hilly and rolling (Mucina & Rutherford 2006) with elevation ranging from 1030 m to 1590 m above sea level. The study site contains same-aged stands (compartments) of Acacia, Eucalyptus and Pinus species. However, the majority of the stands consist of P. patula trees that are grown under a pulpwood management regime. The majority of the stands are between 1 and 25 years old, with harvesting typically occurring when the P. patula trees are 25 years old (Owen 2000).

Data acquisition

QuickBird multispectral (2.4 m resolution) and panchromatic (0.6 m resolution) imagery was acquired on 10 September 2008 under cloudless conditions. The images were georectified using the rational polynomial coefficient (RPC) QuickBird model (Johansen et al. 2007). RPCs were calculated from a digital photogrammetry technique that uses a collinearity equation to construct sensor geometry (ENVI 2006). The RPC model uses sensor information as well as elevation data to orthorectify an image. The digital elevation model (DEM) used in the rectification process was created from 5 m contours of the study area. Following Johansen et al. (2007), the image was also radiometrically corrected using pre-launch calibration coefficients as described in the ENVI 4.3 image processing software.

Field data

Field data for the study area were supplied by Sappi, a paper and pulp company. Field data consisting of species composition, age, stems per hectare, mean diameter at breast height and mean tree height were collected by the forestry company using industry standard enumeration techniques (Owen 2000). Following Jensen et al. (1999), the field data served as ground reference data that were used to predict P. patula age. Stands that were 3 years and younger were not used in the study due to their low canopy closure, which could result in the spectral mixing of pixels (Jensen et al. 1999). After removing outliers, such as newly harvested stands, the final dataset consisted of 142 samples (1214 ha). Age ranged from 4 to 24 years; however, the majority of the samples consisted of younger stands (4 to 12 years) (Figure 2).

Using Hawth’s analysis tools (www.spatialecology.com/htools/overview.php) in ArcGIS 9.1 (ESRI 2006), 70% of the samples were randomly selected and used as a training dataset while the remaining 30% of the samples were used to test the final model.

Figure 1. Location of the study area. The Pinus patula stands (n = 142) in the study area are highlighted in red.
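The random 70/30 partition described above can be sketched in a few lines. This is only an illustration in Python, not the Hawth's tools procedure the authors ran in ArcGIS; the function name `split_train_test`, the integer stand identifiers and the fixed seed are assumptions made for the example.

```python
import random

def split_train_test(sample_ids, train_fraction=0.7, seed=0):
    """Randomly assign sample IDs to a training set and a held-out test set."""
    rng = random.Random(seed)        # fixed seed so the split is reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical identifiers for the 142 stands used in the study.
stand_ids = list(range(142))
train_ids, test_ids = split_train_test(stand_ids)
print(len(train_ids), len(test_ids))
```

Every stand ends up in exactly one of the two sets, so the test stands play no part in growing the model.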

Texture analysis

Haralick et al. (1973) defined 14 texture statistics that are derived from the grey-level co-occurrence matrix (GLCM). The approach describes the probability of any grey level occurring spatially relative to any other grey level within a moving window. The probabilities are stored in a GLCM and statistics are applied to the matrix to generate texture features, which are assigned to the centre pixel of the window. Johansen et al. (2007) used texture analysis and spectral reflectance data to discriminate between old and young forests in British Columbia, Canada (Table 1). Second-order contrast, dissimilarity and homogeneity provided the most significant discrimination between the age classes. Kayitakire et al. (2006) utilized second-order variance, contrast and correlation to discriminate the age of common spruce stands in Belgium. Therefore, based on findings of Johansen et al. (2007) and Kayitakire et al. (2006), the following five co-occurrence texture measures were calculated from the QuickBird panchromatic image: variance, contrast, correlation, homogeneity and dissimilarity. Table 2 provides a detailed description of the texture measures used in this study.
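To make the GLCM procedure concrete, the sketch below builds a co-occurrence matrix for a single 3 × 3 window and derives the five measures from it. It is a minimal illustration under stated assumptions, not the software the authors used: the single horizontal one-pixel offset, the symmetric counting, the four grey levels and the function names `glcm` and `texture_measures` are all choices made for this example.

```python
import math

def glcm(window, levels, dx=1, dy=0):
    """Symmetric, normalised grey-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1    # count each pair in both directions
                counts[j][i] += 1    # so that the matrix is symmetric
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_measures(p):
    """Five texture statistics computed from a normalised GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, j, v in cells)
    mu_j = sum(j * v for i, j, v in cells)
    var_i = sum(v * (i - mu_i) ** 2 for i, j, v in cells)
    var_j = sum(v * (j - mu_j) ** 2 for i, j, v in cells)
    denom = math.sqrt(var_i * var_j)
    covariance = sum(v * (i - mu_i) * (j - mu_j) for i, j, v in cells)
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "variance": var_i,
        # Undefined for a perfectly flat window, where both variances are zero.
        "correlation": covariance / denom if denom > 0 else float("nan"),
    }

# Hypothetical 3 x 3 window already quantised to grey levels 0..3.
window = [[0, 0, 1],
          [0, 1, 2],
          [1, 2, 3]]
stats = texture_measures(glcm(window, levels=4))
print(stats)
```

In a moving-window analysis the resulting values would be written to the centre pixel and the window shifted by one pixel, which is the step this sketch leaves out.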

Window sizes are an important component of a texture analysis because texture is a multi-scale phenomenon (Moskal & Franklin 2001). Using small window sizes could result in poorly sampled co-occurring probabilities and an inconsistent estimate of individual texture measures; while focusing on only larger window sizes could result in the eroding of class boundaries (Jobanputra & Clausi 2006). It is therefore necessary to use a range of small, medium and large window sizes. Johansen et al. (2007) showed that semivariograms can be used to obtain the optimal window size for a texture analysis. However, in this study we followed the recommendation of Moskal and Franklin (2001) by calculating texture using multiple window sizes. The texture variables used in this study were computed using 12 window sizes of 3 × 3 to 25 × 25 pixels in increments of 2 and the mean value for each sample (n = 142) was extracted using the zonal statistics functionality in ArcGIS 9.1 (ESRI 2006).
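As a quick check on the bookkeeping, the window sizes and the resulting predictor set can be enumerated as follows. The variable naming scheme (for example `variance_3x3`) is hypothetical and only serves the illustration; the counts follow from the five texture measures, 12 window sizes and four QuickBird spectral bands described in the text.

```python
# Odd window sizes from 3 x 3 to 25 x 25 pixels in increments of 2.
window_sizes = list(range(3, 26, 2))

measures = ["variance", "contrast", "correlation", "homogeneity", "dissimilarity"]
spectral_bands = ["blue", "green", "red", "NIR"]

# One texture variable per (window size, measure) pair; names are illustrative.
texture_variables = [f"{m}_{w}x{w}" for w in window_sizes for m in measures]
all_variables = spectral_bands + texture_variables

# 12 window sizes, 60 texture variables, 64 variables in total
print(len(window_sizes), len(texture_variables), len(all_variables))
```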

Figure 2. Age distribution of the Pinus patula stands (n = 142) located in the study area.

Table 2. Co-occurrence texture measures used in this study (adapted from Dye et al. 2008).

| Texture | Formula | Description |
| --- | --- | --- |
| Contrast | $\sum_{i,j=0}^{N-1} P_{i,j}(i-j)^2$ | Contrast is a measure of the overall amount of local variation in a window (i.e., it is proportional to the range of grey levels) (Yuan et al. 1991). |
| Dissimilarity | $\sum_{i,j=0}^{N-1} P_{i,j}\lvert i-j\rvert$ | The dissimilarity measure is similar to the contrast measure. However, where contrast weights increase quadratically (0, 1, 4, 9, etc.) as one moves away from the diagonal, dissimilarity weights increase linearly (0, 1, 2, 3 etc.) (Hall-Beyer, 2010). |
| Homogeneity | $\sum_{i,j=0}^{N-1} \dfrac{P_{i,j}}{1+(i-j)^2}$ | Homogeneity measures the smoothness of image texture. Large changes in spectral values will result in very small homogeneity values, while small changes will result in larger homogeneity values (Tuttle et al. 2006). |
| Variance | $\sigma_i^2 = \sum_{i,j=0}^{N-1} P_{i,j}(i-\mu_i)^2$, $\sigma_j^2 = \sum_{i,j=0}^{N-1} P_{i,j}(j-\mu_j)^2$ | Accounts for the variability of the spectral response of pixels (Tuttle et al. 2006) but considers the pairwise combinations of variability. |
| Correlation | $\sum_{i,j=0}^{N-1} P_{i,j}\left[\dfrac{(i-\mu_i)(j-\mu_j)}{\sqrt{(\sigma_i^2)(\sigma_j^2)}}\right]$ | The correlation texture algorithm measures the grey level linear- (continued) |

[The columns "Example (3 × 3 window)" and "Color composite of the corresponding images" contained images that are not reproduced here.]

Random forests for regression applications

Random forests (Breiman 2001) grow many regression trees without statistical pruning, and the result is based on the average of all the regression trees. Individual regression trees in the forest are built using bootstrap aggregation (bagging), which involves randomly drawing, with replacement, a bootstrap sample of the original training dataset. When a bootstrap sample is drawn, approximately one third of the observations are left out, and the remaining observations are replicated to bring the sample to full size (Prasad et al. 2006). The excluded one third of the samples is known as the ‘out-of-bag’ (OOB) samples, while the replicated dataset is known as the ‘in-bag’ samples. Given that the OOB samples were not used to grow a given tree, that tree can be used to predict them, and calculating the mean square error (MSE) using the OOB samples provides an unbiased assessment of the model’s predictive accuracy. By aggregating the OOB predictions of all trees in the forest, the MSE can then be estimated (Liaw & Wiener 2002) as follows:

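$$\mathrm{MSE}_{\mathrm{OOB}} = \frac{\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i^{\mathrm{OOB}}\right)^2}{n} \qquad (1)$$

where $n$ is the number of samples, $Y_i$ is the observed forest age and $\hat{Y}_i^{\mathrm{OOB}}$ is the average of the OOB predictions for the ith observation.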
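The aggregation behind Equation 1 can be sketched as follows. This is a simplified illustration rather than the randomForest implementation: each "tree" is a placeholder that predicts the mean of its in-bag responses, and the function name `oob_mse`, the number of trees and the example ages are assumptions. What the sketch does show faithfully is the bookkeeping, namely that an observation is only ever predicted by trees whose bootstrap sample excluded it.

```python
import random

def oob_mse(y, n_trees=200, seed=0):
    """Estimate MSE_OOB (Equation 1) for a bagged ensemble.

    Each "tree" is a placeholder predicting the mean of its in-bag responses;
    a real random forest would fit a regression tree at this point.
    """
    rng = random.Random(seed)
    n = len(y)
    oob_sums = [0.0] * n
    oob_counts = [0] * n
    for _ in range(n_trees):
        in_bag = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        prediction = sum(y[i] for i in in_bag) / n      # placeholder "tree"
        in_bag_set = set(in_bag)
        for i in range(n):
            if i not in in_bag_set:                     # observation i is out-of-bag
                oob_sums[i] += prediction
                oob_counts[i] += 1
    # Average the OOB predictions per observation, then average squared errors.
    squared_errors = [
        (y[i] - oob_sums[i] / oob_counts[i]) ** 2
        for i in range(n) if oob_counts[i] > 0
    ]
    return sum(squared_errors) / len(squared_errors)

ages = [4, 5, 6, 7, 8, 9, 10, 12, 15, 18, 21, 24]   # hypothetical stand ages (years)
print(oob_mse(ages))
```

Observations that happen never to be out-of-bag are skipped, which matters only when very few trees are grown.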

In addition to bagging, each tree split is based on a random subset of the input variables (Breiman 2001). Therefore, the randomness introduced in the dataset selection (i.e. bagging) and in the variable selection (mtry) makes random forests an accurate tool for prediction. Additionally, there are only two tuning parameters required for growing random forests: the number of trees to be grown (ntree), and the number of possible splitting variables (mtry) which are sampled at each node. However, researchers have shown that the sensitivity of the model to these user-defined parameters is minimal and the default values are often a good choice (Liaw & Wiener 2002; Lawrence et al. 2006; Ismail & Mutanga 2010). We used the R statistical software (R Development Team 2008) and the randomForest library (Liaw & Wiener 2002) for all statistical analysis.
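For orientation, the default mtry for regression in R's randomForest package is one third of the number of predictors, rounded down, with a minimum of one. The Python sketch below reproduces that rule and the random draw of candidate splitting variables at a node; the helper names and the placeholder predictor names are assumptions for illustration, and the 64 predictors correspond to the full variable set described in the abstract.

```python
import random

def default_mtry_regression(n_predictors):
    """Default mtry for regression in R's randomForest: max(floor(p / 3), 1)."""
    return max(n_predictors // 3, 1)

def candidate_split_variables(predictors, mtry, rng):
    """Pick the random subset of predictors examined at a single node."""
    return rng.sample(predictors, mtry)

predictors = [f"x{k}" for k in range(64)]   # 64 placeholder predictor names
mtry = default_mtry_regression(len(predictors))
subset = candidate_split_variables(predictors, mtry, random.Random(1))
print(mtry, len(subset))
```

A fresh subset is drawn at every node of every tree, which is what decorrelates the trees in the ensemble.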

Random forests: variable importance

Random forests are often referred to as a ‘black box’ (Prasad et al. 2006) because of the limited interpretability of the final model (i.e. the results are based on the average of many trees in the forest). To aid interpretation, random forests also produce a measure which ranks the variables according to their importance. Simply stated, the values of each variable in the OOB samples are randomly permuted in turn and the modified OOB data are passed down the trees. The importance measure of each variable is then calculated as the difference in the MSE between the original OOB dataset and the modified dataset (Breiman 2001). It follows that the MSE will increase substantially if the original variable (i.e. texture or spectral variables) was associated with the response variable (i.e. forest age). Therefore, the difference in MSE before and after permuting the variables can be used to measure the importance of variables used in the final random forests model. A key advantage of the random forests variable importance is that it not only deals with the impact of each variable individually, but also looks at multivariate interactions with other variables (Strobl & Zeileis 2008).

Table 2. (Continued).

| Texture | Formula | Description |
| --- | --- | --- |
| Correlation (continued) | | dependency within the image (Kayitakire et al. 2006) |

P is the texture index and i and j refer to adjacent texture pixels.
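The permutation idea can be illustrated with a small sketch. It is a simplification: a real random forest computes the measure tree by tree on each tree's own OOB samples and averages the result, whereas here a single hypothetical fitted model (`toy_model`) and one evaluation set stand in for that machinery.

```python
import random

def mse(model, rows, targets):
    """Mean square error of a model over a set of rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, n_features, seed=0):
    """Increase in MSE after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    importance = []
    for k in range(n_features):
        column = [r[k] for r in rows]
        rng.shuffle(column)    # break the link between feature k and the response
        permuted = [r[:k] + [v] + r[k + 1:] for r, v in zip(rows, column)]
        importance.append(mse(model, permuted, targets) - baseline)
    return importance

def toy_model(row):
    """Hypothetical fitted model: uses feature 0 only, ignores feature 1."""
    return 30.0 - 2.0 * row[0]

rng = random.Random(42)
rows = [[rng.uniform(0, 10), rng.uniform(0, 10)] for _ in range(50)]
targets = [toy_model(r) for r in rows]
scores = permutation_importance(toy_model, rows, targets, n_features=2)
print(scores)
```

Shuffling the informative feature raises the error, while shuffling the ignored one leaves it unchanged, which is the ranking signal the importance measure relies on.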

Random forests as a framework for incorporating texture and spectral data

A key objective of this paper was the development of a framework for the combination of spectral and texture variables. Numerous combinations of variables were tested during the initial data analysis (variables with highest individual correlations to forest age, all texture variables from best windows combined with spectral variables, etc.). The results were promising but in order to find the optimal combination method an existing framework (Puissant et al. 2005) was tested on our dataset. Figure 3 shows the four methods that Puissant et al. (2005) used to combine spectral and texture variables in an effort to improve classification accuracy. The first method involved combining spectral and texture variables according to window size (for example 3 × 3 contrast, 3 × 3 variance and so on) while the second method combined spectral and texture variables according to the optimal texture measure (for example 3 × 3 contrast, 5 × 5 contrast and so on). In total, 12 random forests were grown based on the first method (i.e. optimal window size) and five random forest models were developed based on the second method (i.e. optimal texture measure). Due to the fact that the window sizes and texture measures are highly correlated (St-Louis et al. 2006), Puissant et al. (2005) transformed the texture variables used in method one and method two using principal component analysis (PCA). PCA is a data reduction technique which uses a linear transformation of a set of numerical variables in order to create a new set of components that are uncorrelated and ordered in terms of the amount of variance explained in the original data (Eastman & Fulk 1993). The first three components which explained 98% of the variance from method one and method two were subsequently extracted and combined with the spectral variables.

In the original framework developed by Puissant et al. (2005) the coefficient of variation was used to select the optimal model for each method. However, in this study we used the internal measure of error calculated by random forests to select the optimal model from the methods proposed by Puissant et al. (2005). Generally, random forests use the MSE to assess the accuracy of a model; however, since MSE is scale dependent, model selection was based on the normalized out-of-bag (NOOB) error (Grimm et al. 2008), which is calculated as follows:

$$\mathrm{NOOB\ error} = \frac{\mathrm{MSE}_{\mathrm{OOB}}}{\mathrm{VAR}(Y_k)} \qquad (2)$$

where $\mathrm{VAR}(Y_k)$ is the variance of the response variable (forest age). $\mathrm{MSE}_{\mathrm{OOB}}$ (Equation 1) refers to the mean square error as determined by the out-of-bag samples.
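Equation 2 is straightforward to compute once the aggregated OOB predictions are available. The sketch below is illustrative only; it assumes the population form of the variance (dividing by n), which the paper does not specify, and the example ages are hypothetical.

```python
def variance(values):
    """Population variance of the response (divides by n; an assumption here)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def noob_error(observed, oob_predicted):
    """Normalised out-of-bag error: MSE_OOB divided by the variance of the response."""
    n = len(observed)
    mse_oob = sum((y - p) ** 2 for y, p in zip(observed, oob_predicted)) / n
    return mse_oob / variance(observed)

observed_age = [4, 8, 12, 16, 20, 24]       # hypothetical stand ages (years)
mean_only = [14] * len(observed_age)        # a "model" that always predicts the mean
print(noob_error(observed_age, mean_only))  # no better than the mean
print(noob_error(observed_age, observed_age))
```

Because the error is divided by the variance of the response, a value near 1 means the model does no better than predicting the mean age, and values closer to 0 indicate a better fit regardless of the units of the response.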

In addition to the combination framework developed by Puissant et al. (2005), we propose two additional methods (method 5 and method 6) for combining texture and spectral variables (Figure 3). Method 5 combines all the texture (window sizes and texture measures) and spectral variables (n = 64) using random forests, and examines the final accuracy as determined by the NOOB error. Method 6 is similar to method 5 but uses random forests with backward variable selection. Initially, method 6 grows random forests using all the variables (n = 64), ranks the importance of each variable and then calculates the model performance using the NOOB error. The model is then run again after dropping the least important variable, and model performance is again calculated. This process continues until there are no variables left. The model with the lowest NOOB error is then selected as the best model, containing the most significant variables (Kuhn 2009). The rationale of using the backward variable selection with random forests is that prediction accuracy will stay relatively constant when unimportant variables are dropped, but will decrease when relevant ones are excluded (Strobl & Zeileis 2008). The backward variable selection process was carried out in R using the CARET package (Kuhn 2009).
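The backward elimination loop of method 6 can be outlined as below. This is a sketch of the general procedure, not of the caret implementation the authors used: `fit` is a hypothetical caller-supplied function that grows a model on a subset and returns its NOOB error together with a variable importance ranking, importance is assumed to be recomputed at every step, and `toy_fit` is an invented stand-in for a random forest so that the example runs on its own.

```python
def backward_selection(variables, fit):
    """Drop the least important variable each round; keep the lowest-error subset.

    `fit(subset)` is a caller-supplied (hypothetical) function returning
    (noob_error, {variable: importance}) for a model grown on `subset`.
    """
    current = list(variables)
    best_subset, best_error = None, float("inf")
    while current:
        error, importance = fit(current)
        if error < best_error:
            best_subset, best_error = list(current), error
        weakest = min(current, key=lambda v: importance[v])
        current.remove(weakest)
    return best_subset, best_error

# Toy stand-in for a random forest: two informative variables, the rest add noise.
SIGNAL = {"NIR": 0.4, "green": 0.2}

def toy_fit(subset):
    explained = sum(SIGNAL.get(v, 0.0) for v in subset)
    noise_penalty = 0.01 * sum(1 for v in subset if v not in SIGNAL)
    importance = {v: SIGNAL.get(v, 0.0) for v in subset}
    return 1.0 - explained + noise_penalty, importance

best, err = backward_selection(["NIR", "green", "t1", "t2", "t3"], toy_fit)
print(best, err)
```

In the toy case the error falls slightly as each uninformative variable is removed and jumps once an informative one is dropped, so the retained subset is the one just before that jump.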

3. Results

Combining spectral and texture variables using optimal window size

We tested the utility of the first method (Figure 3) by combining the spectral variables with different texture measures that were derived using the same window size. As mentioned earlier, texture measures (n = 5) were calculated using 12 window sizes. The texture measures (e.g. 3 × 3 contrast, 3 × 3 variance, 3 × 3 correlation and so on) were then combined with the spectral variables (n = 4). From the 12 models that were created (Table 3a), the texture measures calculated using a 3 × 3 window size, when combined with the spectral variables, performed best (30.45% NOOB error).

Figure 3. Framework for combining texture and spectral remote sensing data. PCA refers to principal components analysis.
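As a rough illustration of how the variable sets are assembled, the sketch below enumerates the 64 variables and groups them into one model per window size, and also into one model per texture measure as reported in Table 3b. The variable name strings are hypothetical labels; the five measure names and the 3 × 3 to 25 × 25 window sizes are taken from Tables 3a and 3b, and the band names from the final model.

```python
# Five texture measures and twelve odd window sizes (3x3 ... 25x25).
measures = ["contrast", "correlation", "dissimilarity", "homogeneity", "variance"]
windows = list(range(3, 26, 2))
spectral = ["blue", "green", "red", "nir"]

# Every texture measure at every window size: 5 * 12 = 60 texture variables.
texture = [f"{m}_{w}x{w}" for m in measures for w in windows]
all_variables = spectral + texture  # 4 + 60 = 64

# One model per window size: spectral bands plus the five measures at that size.
by_window = {w: spectral + [f"{m}_{w}x{w}" for m in measures] for w in windows}

# One model per texture measure: spectral bands plus that measure at all sizes.
by_measure = {m: spectral + [f"{m}_{w}x{w}" for w in windows] for m in measures}

print(len(all_variables), len(by_window), len(by_measure))
```

Each per-window model therefore has 9 input variables and each per-measure model has 16, whereas methods 5 and 6 start from the full set of 64.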

The first three principal components of the texture measures that were derived using the same window size were also combined with the spectral variables (method 3). In the majority of the models, combining the spectral variables with the principal components of the texture variables produced better results than using the original texture variables. Table 3a shows that eight out of the 12 models that used principal components of the texture variables yielded lower NOOB error rates. Overall, the principal components of the texture measures calculated using a 3 × 3 window, when combined with the spectral variables, produced the lowest NOOB error (28.78%).
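The comparison above can be checked directly against Table 3a. The short sketch below only tallies the published NOOB errors; the dictionary layout and variable names are illustrative choices.

```python
# NOOB error (%) from Table 3a:
# window size -> (original texture measures, principal components)
table_3a = {
    3: (30.45, 28.78), 5: (31.84, 32.94), 7: (32.32, 34.84),
    9: (32.25, 33.72), 11: (31.15, 32.45), 13: (33.27, 30.32),
    15: (32.60, 31.05), 17: (33.59, 31.44), 19: (33.78, 29.90),
    21: (34.67, 30.33), 23: (35.03, 31.09), 25: (35.59, 32.45),
}

# Window sizes where the principal components gave the lower error.
pca_better = [w for w, (orig, pca) in table_3a.items() if pca < orig]

# Window size with the lowest error in each column.
best_original = min(table_3a, key=lambda w: table_3a[w][0])
best_pca = min(table_3a, key=lambda w: table_3a[w][1])

print(len(pca_better), "of", len(table_3a), "models improved with PCA")
print("best window (original):", best_original, "best window (PCA):", best_pca)
```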

Combining spectral and texture variables using optimal texture measure

We then combined the spectral variables with individual texture measures calculated at different window sizes (e.g. 3 × 3 contrast, 5 × 5 contrast, 7 × 7 contrast and so on). From the five models that were created (Table 3b), the model that used the variance texture measure produced the lowest NOOB error (30.71%). The NOOB error rates for the other texture measures were as follows: contrast (35.67%), dissimilarity (36.63%), homogeneity (37.64%) and correlation (39.22%).

The first three principal components of individual texture measures calculated at different window sizes were also combined with the spectral variables (method 4). Results show that four out of the five principal components models produced better results than the models using the original texture variables. Table 3b shows that the model that used the principal components of the contrast texture measure produced the best result (33.75%), while the principal components of the variance texture measure, when combined with the spectral variables, yielded the worst result (34.57%).

Combining all the spectral and texture variables using random forests

Method 5 of the proposed framework combined all the spectral and texture variables (n = 64) using random forests. Using the default parameters (mtry = 21; ntree = 500), random forests produced a NOOB error of 34.30%. Figure 4 shows the top eight variables ranked by importance. The NIR band was the highest ranked variable by the random forests, followed by the green band and the variance texture measure calculated using a 3 × 3 window. Excluding the variance texture measure, the spectral bands are more highly ranked than the texture variables used in this study.

Table 3a. Combining the spectral variables with various window sizes calculated from the original data and the principal components of the various window sizes.

Window size    Combined spectral and window size    Principal components
3 × 3          30.45%                               28.78%
5 × 5          31.84%                               32.94%
7 × 7          32.32%                               34.84%
9 × 9          32.25%                               33.72%
11 × 11        31.15%                               32.45%
13 × 13        33.27%                               30.32%
15 × 15        32.60%                               31.05%
17 × 17        33.59%                               31.44%
19 × 19        33.78%                               29.90%
21 × 21        34.67%                               30.33%
23 × 23        35.03%                               31.09%
25 × 25        35.59%                               32.45%

Table 3b. Combining the spectral variables with various texture measures using the original texture images and the principal components of the texture images.

Texture measure    Combined spectral and texture measures    Principal components
Contrast           35.67%                                    33.75%
Correlation        39.22%                                    34.03%
Dissimilarity      36.63%                                    34.17%
Homogeneity        37.64%                                    34.02%
Variance           30.71%                                    34.57%

Backward variable selection

Method 6 of the proposed framework combined all the spectral and texture variables (n = 64) using random forests (mtry = 21; ntree = 500) and a backward variable selection process. Using the ranked variable importance (Figure 4), the least important variable was dropped and the NOOB error was recalculated. This process was repeated 64 times until there were no more variables to drop (Figure 5). Results showed that the best model obtained a NOOB error of 27.72% while using only five of the 64 original variables (Table 4). The five variables selected by the backward variable selection were the NIR, green, variance (3 × 3 window), red and blue variables. Figure 5 shows that using all the variables (texture and spectral) does not necessarily improve the model’s predictive accuracy; rather, there is an optimal number of variables that produces the best results.

Accuracy assessments

Ideally, model performance is calculated using a large independent test dataset that was not used in the training procedure (Grimm et al. 2008). Therefore, the predictive accuracy (R²) of the model that produced the lowest NOOB error was tested using 30% of the dataset (n = 43) that was excluded from the modelling process. Overall, the best model with the lowest NOOB error (27.72%) was based on the backward variable selection method (method 6) and yielded an R² value of 0.68. For comparative purposes we also calculated the predictive accuracy of the random forests that used all the texture and spectral variables. The backward variable selection method increased R² by 0.05 compared to the random forests that used all the texture and spectral variables (R² = 0.63).
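The hold-out assessment can be illustrated with a small sketch. It assumes that R² is the coefficient of determination (1 minus the ratio of residual to total sum of squares); the text does not state which definition was used, and the ages, predictions, function names and random seed below are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def holdout_split(
    items: Sequence, test_fraction: float = 0.3, seed: int = 0
) -> Tuple[List, List]:
    """Shuffle the samples and set aside a fraction the model never sees."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical stand ages (years) and model predictions for a held-out set.
observed_age = [5.0, 10.0, 15.0, 20.0]
predicted_age = [6.0, 9.0, 16.0, 19.0]
score = r_squared(observed_age, predicted_age)

# Hypothetical sample identifiers split 70/30.
train, test = holdout_split(range(10))
print(round(score, 3), len(train), len(test))
```

Because the test samples play no part in model fitting, this score complements the out-of-bag error, which is estimated from the training data alone.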

Figure 4. Ranked variable importance as determined by random forests. For brevity only the top eight variables are shown.


4. Discussion

This study has shown the potential of random forests for the evaluation of spectral and texture variables to determine which bands are the most significant for the prediction of P. patula age. Random forests provide an ideal framework for the integration of spectral and texture data because, unlike traditional linear methods which require certain statistical assumptions to be met, random forests are robust and can handle complex remotely sensed data.

Window size versus texture measure

An existing framework for combining spectral and texture variables (Puissant et al. 2005) was evaluated using random forests. The models that combined individual texture measures calculated at various window sizes with the spectral data yielded better results than the models that combined various texture measures according to a single window size. The results from this study reiterate the importance of window sizes when implementing texture analysis (Moskal & Franklin 2001). Smaller window sizes will capture the textural information of individual trees, while the larger window sizes will describe the textural characteristics of forest stands. In this study the variance texture measure using a 3 × 3 window size proved to be the most suitable for forest age prediction. This is perhaps because the trees in the study area are grown for pulpwood and are therefore planted close together. On a typical growth cycle for pulpwood, P. patula trees are planted at 1200 stems per hectare (SPH). Even after 25 years the SPH is still high at 950 (Owen 2000). Past research has shown that smaller window sizes are often more appropriate as they provide more detailed textural information of individual trees. Similar results were obtained by Johansen et al. (2007) and Wunderle et al. (2007). Johansen et al. (2007) used semivariograms to assess which window sizes were most appropriate for the separation of vegetation structural stages. Results showed that the 3 × 3 and 11 × 11 window sizes were most appropriate for structural class separation. The 3 × 3 window size was most suitable for discrimination between old and young forest classes. According to Wunderle et al. (2007) there is a relationship between forest structure and textural measures using the larger window size. However, the smaller window size adds the most information when estimating forest stand structure.

Figure 5. Results of the backward variable selection method and the associated normalised out-of-bag error. The arrow indicates the lowest error obtained using the following five variables: NIR, green, variance (3 × 3 window), red and blue.

Original texture data versus principal components

It follows that by implementing PCA, the data redundancy inherent in the texture images decreases and the information is compressed into a number of uncorrelated components (Ricotta et al. 1999). The ability to deal with correlated variables is especially pertinent in this study since the various texture measures calculated at many window sizes are highly correlated (St-Louis et al. 2006). Studies have shown that principal components calculated from texture variables can reduce correlation and improve model performance. For example, Johansen et al. (2007) calculated principal components of texture variables and found that the principal components produced adequate classification accuracies and were beneficial for compressing image data for classification purposes. Additionally, in the study by Puissant et al. (2005), the calculation of principal components from the optimal texture measure resulted in a higher classification accuracy. Results from this study confirm that using the principal components of the various texture variables yielded better results than using the original texture variables with random forests.

Backward variable selection and random forests

Overall, the best model was obtained using backward variable selection with random forests (method 6). The proposed method produces a higher predictive accuracy (R² = 0.68) than using all the spectral and texture variables with random forests (R² = 0.63), or any of the existing methods proposed by Puissant et al. (2005). Additionally, the method simplifies the modelling process by identifying the smallest number of input variables that offer the best discriminatory power and aid in the empirical interpretation of the final model.

Results from this study show that only five variables (7.8% of the original 64 variables) were used in the final model (NIR, green, variance with a 3 × 3 window, red and blue). The spectral variables, especially the NIR and green bands, performed well in the final model. This is because green plants characteristically absorb visible electromagnetic radiation and strongly scatter in the NIR bands (Curran 1980). Leaf reflectance is determined by concentrations of plant pigments such as chlorophyll. Young healthy plants will contain higher concentrations of chlorophyll a and b while older plants will contain less (Schmidt 2003).

The variance texture measure calculated using a 3 × 3 moving window was the only texture measure included in the model. A possible explanation for the poor contribution of the other texture variables is the homogeneous nature of the pulpwood stands. Since the trees are grown more densely, the compartments are homogeneous in terms of their structure and this is perhaps why the texture analysis did not perform as well as expected. In contrast, the variance statistic is considered to be relevant for forest age discrimination because it accounts for all pair-wise combinations of variability and can therefore detect subtle changes in image texture that occur as a result of plant growth (Kayitakire et al. 2002).
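To make the variance statistic concrete, the sketch below computes a grey-level co-occurrence (GLCM) variance for a single 3 × 3 window. It follows the common formulation in which the variance is the probability-weighted squared deviation of grey levels from the GLCM mean. The horizontal one-pixel offset, the symmetric counting and the integer grey levels are assumptions for illustration; the settings used in the study are not given in this section.

```python
from collections import Counter
from typing import Sequence


def glcm_variance(window: Sequence[Sequence[int]]) -> float:
    """GLCM variance of a small window using horizontal neighbour pairs."""
    pairs: Counter = Counter()
    for row in window:
        for left, right in zip(row, row[1:]):
            # Count each pair in both directions so the matrix is symmetric.
            pairs[(left, right)] += 1
            pairs[(right, left)] += 1

    total = sum(pairs.values())
    # Normalise counts to co-occurrence probabilities P(i, j).
    prob = {ij: count / total for ij, count in pairs.items()}

    # GLCM mean and variance over the reference grey level i.
    mean = sum(i * p for (i, _), p in prob.items())
    return sum(p * (i - mean) ** 2 for (i, _), p in prob.items())


# Hypothetical windows: a uniform patch and a patch with alternating grey levels.
flat = [[4, 4, 4], [4, 4, 4], [4, 4, 4]]
mixed = [[1, 3, 1], [3, 1, 3], [1, 3, 1]]
print(glcm_variance(flat), glcm_variance(mixed))
```

A uniform window yields zero variance, while a window with strongly alternating grey levels yields a higher value, which is the behaviour the discussion relies on when relating canopy openness to variance.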

A partial response plot (Figure 6) shows that younger trees have higher variance. This is because they have a more open canopy which results in a higher variation between pixels. Older stands on the other hand have lower variance as they are structurally complex and characterized by a patchy upper canopy with coarse woody debris on the soil surface in all stages of decomposition (Johansen et al. 2007).

5. Conclusion

Random forests were found to be a useful and robust tool for combining spectral and texture remote sensing data for forest age prediction. Testing various combinations of spectral and texture variables in a random forests environment showed that principal components of the texture datasets produced better results than simply using the original texture images. Furthermore, using different window sizes to combine spectral and texture variables was more successful than using texture measures. The best model was achieved using the random forests backward variable selection method, which reduces error and simplifies the modelling process by selecting only the most important variables to include in the final model. Only five variables were used (NIR, green, variance with a 3 × 3 window, red and blue) and the final model yielded a predictive accuracy of R² = 0.68.

Figure 6. A partial plot of the variance 3 × 3 texture measure and forest age.

Table 4. Summary of the normalized out-of-bag error (NOOB) for the best models for various methods evaluated in this study.

Method    Number of models created    Texture variables used in the best model    NOOB error of the best model
1         12                          3 × 3 window                                30.45%
3         12                          PCA of 3 × 3 window                         28.78%
2         5                           Contrast                                    33.75%
4         5                           PCA of variance                             30.71%
5         1                           All the texture measures                    34.30%
6         64                          3 × 3 variance                              27.72%

Acknowledgements

The authors would like to thank Sappi for providing the QuickBird imagery and field data.

References

Ahern, F.J., Erdle, T., Maclean, D.A., & Kneppeck, I.D. (1991) A quantitative relationship between forest growth rates and thematic mapper reflectance measurements. International Journal of Remote Sensing, vol. 12, pp. 387–400.

Archer, K.J., & Kimes, R.V. (2008) Empirical characterization of random forest variable importance measures. Computational Statistics and Data Analysis, vol. 52, pp. 2249–2260.

Bel, L., Allard, D., Laurent, J.M., Chaddadi, R., & Bar-Hen, A. (2009) CART algorithm for spatial data: Application to environmental and ecological data. Computational Statistics and Data Analysis, vol. 53, pp. 3082–3093.

Breiman, L., Friedman, J.H., Olshen, R.A., & Stone, C.J. (1984) Classification and Regression Trees, Wadsworth International Group, Belmont.

Breiman, L. (2001) Random forests. Machine Learning, vol. 45, pp. 5–32.

Chan, J.C.-W., & Paelinckx, D. (2008) Evaluation of Random Forest and Adaboost tree-based ensemble classification and spectral band selection for ecotope mapping using airborne hyperspectral imagery. Remote Sensing of Environment, vol. 112, pp. 2999–3011.

Cohen, W.B., Spies, T.A., & Bradshaw, G.A. (1990) Semivariograms of Digital Imagery for Analysis of Conifer Canopy Structure. Remote Sensing of Environment, vol. 34, pp. 167–178.

Curran, P. (1980) Multispectral remote sensing of vegetation amount. Progress in Physical Geography, vol. 4, pp. 315–341.

Cutler, R.D., Edwards, T.C., Beard, K.H., Cutler, A., Hess, K.T., Gibson, J., & Lawler, J.J. (2007) Random forests for classification in ecology. Ecology, vol. 88, no. 11, pp. 2783–2792.

Danson, F.M., & Curran, P.J. (1993) Factors affecting the remotely sensed response of coniferous forest plantations. Remote Sensing of Environment, vol. 43, pp. 55–65.

De’ath, G., & Fabricius, K. (2000) Classification and regression trees: a powerful yet simple technique for ecological data analysis. Ecology, vol. 81, no. 11, pp. 3178–3192.

DeFries, R.S., Hansen, M., Steininger, M., Dubayah, R., Sohlberg, R., & Townshend, J. (1997) Sub pixel forest cover in Central Africa from multisensor, multitemporal data. Remote Sensing of Environment, vol. 60, pp. 228–246.

Digital Globe. (2008) http://www.digitalglobe.com/index.php/85/QuickBird (accessed 10 November 2009).

Dye, M., Mutanga, O., & Ismail, R. (2008) Detecting the severity of woodwasp, Sirex noctilio, infestation in a pine plantation in KwaZulu-Natal, South Africa, using texture measures calculated from high spatial resolution imagery. African Entomology, vol. 16, no. 2, pp. 263–275.

Eastman, J.R., & Fulk, M. (1993) Long sequence time series evaluation using standardized principal components. Photogrammetric Engineering and Remote Sensing, vol. 59, no. 6, pp. 991–996.

Elith, J., Leathwick, J.R., & Hastie, T. (2008) A working guide to boosted regression trees. Journal of Animal Ecology, vol. 77, pp. 802–813.

ENVI (2006) ENVI Version 4.3, ITT Industries, Inc., Boulder, CO.

ESRI (2006) ArcGIS Version 9.1, ESRI, Redlands, CA.

Franklin, S.E., Hall, R.J., Moskal, L.M., Maudie, A.J., & Lavigne, M.B. (2000) Incorporating texture into classification of forest species composition from airborne multispectral images. International Journal of Remote Sensing, vol. 21, no. 1, pp. 61–79.

Franklin, S.E., Wulder, M.A., & Gerylo, G.R. (2001) Texture analysis of IKONOS panchromatic data for Douglas-fir forest age class separability in British Columbia. International Journal of Remote Sensing, vol. 22, no. 13, pp. 2627–2632.

Franklin, S.E., Hall, R.J., Smith, L., & Gerylo, G.R. (2003) Discrimination of conifer height, age and crown closure classes using Landsat-5 imagery in the Canadian Northwest Territories. International Journal of Remote Sensing, vol. 24, no. 9, pp. 1823–1834.

Gebreslasie, M.T., Ahmed, F.B., & van Aardt, J. (2008) Estimating plot-level forest structural attributes using high spectral resolution ASTER satellite data in even-aged Eucalyptus plantations in southern KwaZulu-Natal, South Africa. Southern Forests, vol. 70, no. 3, pp. 227–236.


Gerylo, G.R., Hall, R.J., Franklin, S.E., & Smith, L. (2002) Empirical relations between Landsat TM spectral response and forest stands near Fort Simpson, Northwest Territories, Canada. Canadian Journal of Remote Sensing, vol. 28, no. 1, pp. 68–79.

Grimm, R., Behrens, T., Marker, M., & Elsenbeer, H. (2008) Soil organic carbon concentrations and stocks on Barro Colorado Island – Digital soil mapping using Random Forests analysis. Geoderma, vol. 146, pp. 102–113.

Hall-Beyer, M. (2010) The GLCM Tutorial Home Page, University of Calgary, Canada. http://www.ucalgary.ca (accessed 12 March 2010).

Hansen, M.C., DeFries, R.S., Townshend, J.R.G., Sohlberg, R., Dimiceli, C., & Carroll, M. (2002) Towards an operational MODIS continuous field of percent tree cover algorithm: examples using AVHRR and MODIS data. Remote Sensing of Environment, vol. 83, pp. 303–319.

Haralick, R.M., Shanmugan, K., & Dinstein, I. (1973) Texture features for image classification. IEEE Transactions on Systems, Man and Cybernetics, vol. 3, no. 6, pp. 610–621.

Ismail, R. (2009) Remote sensing of forest health: the detection and mapping of Pinus patula trees infested by Sirex noctilio. Dissertation (PhD). School of Environmental Sciences, University of KwaZulu-Natal, Pietermaritzburg, South Africa.

Ismail, R., & Mutanga, O. (2010) A comparison of regression tree ensembles: predicating Sirex noctilio induced water stress in Pinus patula forest of KwaZulu-Natal, South Africa. International Journal of Applied Earth Observation and Geoinformation, vol. 12, pp. 45–51.

Jakubauskas, M.E., & Price, K.P. (2000) Regression-based estimation of lodgepole pine forest age from Landsat Thematic Mapper Data. Geocarto International, vol. 15, no. 1, pp. 1–6.

Jensen, J.R., Qui, F., & Ji, M. (1999) Predictive modelling of coniferous forest age using statistical and artificial neural network approaches applied to remote sensor data. International Journal of Remote Sensing, vol. 20, no. 14, pp. 2805–2822.

Jobanputra, R., & Clausi, D.A. (2006) Preserving boundaries for image texture segmentation using grey level co-occurring probabilities. Pattern Recognition, vol. 9, pp. 234–245.

Johansen, K., Coops, N.C., Gergel, S.E., & Stange, Y. (2007) Application of high spatial resolution satellite imagery for riparian and forest ecosystem classification. Remote Sensing of Environment, vol. 110, pp. 29–44.

Kayitakire, F., Giot, P., & Defourny, P. (2002) Automated delineation of the forest stands using digital color orthophotos: Case study in Belgium. Canadian Journal of Remote Sensing, vol. 28, no. 5, pp. 629–640.

Kayitakire, C., Hamel, C., & Defourny, P. (2006) Retrieving forest structure based on image texture analysis and IKONOS-2 imagery. Remote Sensing of Environment, vol. 102, pp. 390–401.

Kuhn, K. (2009) Variable selection using the caret package. http://ftp.udc.es/public/CRAN/web/packages/caret/vignettes/caretSelection.pdf (accessed 21 November 2009).

Lawrence, R.L., Wood, S.D., & Sheley, R.L. (2006) Mapping invasive plants using hyperspectral imagery and Breiman Cutler classifications (RandomForest). Remote Sensing of Environment, vol. 100, pp. 356–362.

Liaw, A., & Wiener, M. (2002) Classification and regression by randomForest. R News, vol. 2/3, pp. 18–22.

Lobell, D.B., Oritiz-Monasterio, J.I., Asner, G.P., Naylor, R., & Falcon, W. (2007) Combining field surveys, remote sensing and regression trees to understand yield variations in an irrigated wheat landscape. Agronomy Journal, vol. 97, pp. 241–249.

Michaelson, J., Schimel, D.S., Friedl, M.A., Davis, F.W., & Dubayah, R.O. (1994) Regression tree analysis of satellite and terrain data to guide vegetation sampling and surveys. Journal of Vegetation Science, vol. 5, pp. 673–696.

Moskal, L.M., & Franklin, S.E. (2001) Classifying multilayer forest structure and composition using high resolution, compact airborne spectrographic imager image texture. American Society of Remote Sensing and Photogrammetry Annual Conference, St. Louis, April, 2001.

Mucina, L., & Rutherford, M.C. (eds.) (2006) The Vegetation of South Africa, Lesotho and Swaziland. Strelitzia 19, South African National Biodiversity Institute, Pretoria.

Nieman, K.O. (1995) Remote sensing of forest stand age using airborne spectrometer data. Photographic Engineering and Remote Sensing, vol. 61, pp. 1119–1127.


Owen, D.L. (ed.) (2000) Southern African Forestry Handbook, Volume 1, The Southern African Institute of Forestry.

Prasad, A.M., Iverson, L.R., & Liaw, A. (2006) Newer classification and regression tree techniques: bagging and random forests for ecological prediction. Ecosystems, vol. 9, pp. 181–199.

Puissant, A., Hirsch, J., & Weber, C. (2005) The utility of texture analysis to improve per-pixel classification for high to very high spatial resolution imagery. International Journal of Remote Sensing, vol. 26, no. 4, pp. 733–745.

R Development Core Team. (2008) R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna.

Ricotta, C., Avena, G.C., & Volpe, F. (1999) The influence of principal component analysis on the spatial structure of a multispectral dataset. International Journal of Remote Sensing, vol. 20, no. 17, pp. 3367–3376.

Schmidt, K.S. (2003) Hyperspectral Remote Sensing of Vegetation Species Distribution in Saltmarsh, International Institute for Geo-Information Science and Earth Observation, Enschede.

Spatial Ecology. (2010) Hawth’s Analysis Tools for ArcGIS. www.spatialecology.com/htools/overview.php (accessed 2 April 2010).

St-Louis, V., Pidgeon, A.M., Radeloff, V.C., Hawbaker, T.J., & Clayton, M.K. (2006) High resolution image texture as a predictor of bird species richness. Remote Sensing of Environment, vol. 105, pp. 299–312.

Strobl, C., & Zeileis, A. (2008) Danger: high power! – exploring the statistical properties of a test for random forest variable importance. Proceedings in Computational Statistics, vol. 2, pp. 59–66.

Tomppo, E.O., Gagliano, C., De Natale, F., Katila, M., & McRoberts, R.E. (2009) Predicting categorical forest variables using an improved k-Nearest Neighbour estimator and Landsat imagery. Remote Sensing of Environment, vol. 113, pp. 500–517.

Tuttle, E.M., Jensen, R.R., Formica, V.A., & Gonser, R.A. (2006) Using remote sensing image texture to study habitat use patterns: a case study using the polymorphic white-throated sparrow (Zonotrichia albicollis). Global Ecology and Biogeography, vol. 15, pp. 349–357.

van Aardt, J.A.N., & Norris-Rogers, M. (2008) Spectral-age interactions in managed, even-aged Eucalyptus plantations: application of discriminant analysis and classification and regression trees approaches to hyperspectral data. International Journal of Remote Sensing, vol. 29, pp. 1841–1845.

Wunderle, A.L., Franklin, S.E., & Guo, X.G. (2007) Regenerating boreal forest structure estimation using SPOT-5 pan-sharpened imagery. International Journal of Remote Sensing, vol. 28, no. 19, pp. 4351–4364.

Yuan, X., King, D., & Vleck, J. (1991) Sugar maple decline assessment based on spectral and textural analysis of multispectral aerial videography. Remote Sensing of Environment, vol. 37, pp. 47–54.

Zhang, C., Franklin, S.E., & Wulder, M.A. (2004) Geostatistical and texture analysis of airborne-acquired images used in forest classification. International Journal of Remote Sensing, vol. 4, pp. 859–865.
