technical papers - grupos.unileon.esgrupos.unileon.es/ingecart/files/2013/05/jse_2011a.pdf · john...

Technical Papers113 Deformation Assessment Considering an A Priori Functional Model in

a Bayesian FrameworkBarbara Betti, Noemi Emanue/a Cazzaniga, and Vincenza Tornatore

120 An Improved Weighted Total Least Squares Method with Applieationsin Linear Fitting and Coordinate TransformationXiaohua Tong, Yanmin Iin, and Lingyun Li

129 MINQUE of Varianee-Covarianee Components in Linear Gauss-MarkovModelsPeng Junhuan, Shi Yun, Li Shuhui, and Yang Honglei

140 Machine Learning Teehniques Applied to the Assessment of GPSAeeuraey under the Forest CanopyCelestino Ordóñez, José R_ Rodríguez-Pérez, Juan J. Moreira,J. M. Matías, and Enoe Sanz-Ablanedo

150 Deteetion of Outliers in GPS Measurements by Using Functional-DataAnalysisC. Ordoñez, J. Martínez, J. R. Rodríguez-Pérez, and A. Reyes

Case Studies156 Evaluation of GPSjGalileo RTK Network Configuration: Case Study

in GreeceMaria Tsakiri

Technical Notes167 Seeond Order Design of Geodetie Networks by the Simulated

Annealing MethodSergio Baselga

Book Reviews174 Review of CORS and OPUS for Engineers. Tools for Surveying and

Mapping Applications, edited by T. SolerJohn V. Hamilton

Reviewers175 Reviewers

Geomatics Division

There exist many GNSS applications in forestry: GPS receiversare used to navigate forested areas (Mancebo and Chamberlain2000), make maps to different scales, take forest inventory(Murphy et al. 2006), make area and perimeter estimates (Tachikiet al. 2005), manage forestry machinery (McDonald and Carter2002; McDonald and Fulton 2005), characterize forest roads(Martin et al. 2001), and more. GPS is the main tool used for pre-cision forestry, i.e., planning and implementing site-specific forestmanagement activities and .operations aimed at improving woodproduct quality and use, reducing waste, increasing profits, andmaintaining the quality of the environment (Taylor et al. 2006).Precision and accuracy requirements depend on the forestry appli-cation; for example, precision in meters is sufficient to delirnit largeforests, whereas precision in decimeters and even centimeters isrequired for individual trees.

The receiver, thus, needs to be selected in accordance withthe type of study performed. Code-phase differential GPS (DGPS)receivers (errors of 2-5 m) are wide1y used for most forestryapplications, whereas carrier-phase DGPSs are required for preci-sion forestry applications.

To assess GPS observation quality according to tree mastype, the factors that condition positioning precision and accuracyunder a forest canopy need to be identified. Research in this fieldhas focused essentially on identifying dasometric variables (treediameter and height, treetop height, plantation density, and woodvolume) and on comparative studies of measurement equipmentand methods. Sawaguchi et al. (2003) demonstrated that forest typeand antenna height condition positioning precision in differentialmode. Basal area is one of the dasometric variables with the great-est bearing on precision for both pseudorange and carrier-phase po-sitioning (Naesset 2001; Naesset and Jonmeister 2002). Yoshimuraand Hasegawa (2003) concluded that the plantation structurealso conditions positioning accuracy and precision. Other studieshave focused on modeling error by using logistic regression teevaluate the probability of resolving ambiguity (Naesset et al. 2000:Hasegawa and Yoshimura 2003). Monte Carlo simulation has also

Machine Learning Techniques Applied to the Assessmentof GPS Accuracy under the Forest Canopy

Celestino Ordóñez ': José R. Hodrlquez-Pérez''; Juan J. Moreira''; J. M. Matías": and Enoe Sanz-Ablanedo"

Abstraet: The geographic location of points using global positioning system (GPS) receivers is less accurate in forested environments thanin open spaces because of signalloss and the multipath effect of tree trunks, branches, and leaves. This has been confirmed in studies that haveconcluded that a relationship exists between measurement accuracy and certain variables that characterize forest canopy, such as tree density,basal area, and biomass volume. However, the practical usefulness of many of these studies is lirnited because they are often lirnited to

- describing associations between the variables and mean errors in the measurement interval, when measurements should be made in realtime and in intervals of seconds. In this work, machine learning techniques were applied to build mathematical models that would associateobservation error and GPS signal and forest canopy variables. The results reveal that the excessive complexity of the signal prevents accuratemeasurement of observation error, especially in the Z-coordinate and in time intervals of a few seconds, but also reveal that forest covervariables have a significantly greater influence than GPS factors, such as position dilution of precision (PDOP), the number of satellites, orerror for the base station. DOI: 10.1061/(ASCE)SU.1943-5428.0000049. © 2011 American Society of Civil Engineers.

CE Database subjeet headings: Global positioning systems; Neural networks; Satellites; Surveys; Forests.

Author keywords: Global positioning; Neural networks; Satellite; Surveys; Forests.

Introduction

Global navigation satellite system (GNSS) and particularly globalpositioning system (GPS) technologies are widely used in landsurveying for engineering, mapmaking, and cadastral purposes.They are also used in forestry applications. GNSS receivers locatedin open spaces and operating in a differential positioning mode canmeasure the coordinates to a point with rnillimeter accuracy. Undera forest canopy, however, attenuation (Yoshimura and Hasegawa2003) and temporary loss of lock of the GPS signal (Hasegawaand Yoshimura 2007) affect GPS positioning precision and accu-racy. Nevertheless, it is possible to achieve an accuracy of betterthan 0.1 m under the forest canopy by using real-time kinematic(RTK) processing (Bakula et al. 2009), although the equipmentneeds to be reinitiated frequently to achieve redundancy in theobservations.

IAssociate Professor, Dept. of Environmental Engineering, Univ. ofVigo, Rúa Maxwell s/n, Campus Lagoas-Marcosende, 3631O-Vigo,Pontevedra, Spain. E-mail: [email protected]

2Associate Professor, Geomatics Engineering Research Group, Univ. ofLeón, Avda. Astorga s/n, 24400 Ponferrada, León, Spain (correspondingauthor). E-mail: [email protected]

3Ph.D. Student, Dept. of Environmental Engineering, Univ. of Vigo,Rúa Maxwell s/n, Campus Lagoas-Marcosendc, 3631O-Vigo, Pontevedra,Spain. E-mail: [email protected]

"Asscciate Professor, Dept. of Statistics, Univ. of Vigo, CampusUniversitario As Lagoas-Marcosende s/n, 36310 Vigo, Pontevedra, Spain.E-mail: [email protected]

5Assistant Professor, Geomatics Engineering Research Group, Univ. ofLeón, Avda. Astorga s/n, 24400 Ponferrada, León, Spain. E-mail: [email protected]

Note. This manuscript was submitted on June 22, 2010; approved onNovember 28, 2010; published online on December 9, 2010. Discussionperiod open until April 1, 2012; separate discussions must be submittedfor individual papers. This paper is part of the Journal of Surveying En-gineering, Vol. 137, No. 4, November 1, 2011. ©ASCE, ISSN 0733-9453/201114-140-149/$25.00.

140/ JOURNAL OF SURVEYING ENGINEERING © ASeE / NOVEMBER 2011

been used to estimate the relationship between satellite signal andforest structure (Sawaguchi et al. 2005). Other researchers havefocused on evaluating accuracy and precision measurementsobtained by using different types of GNSS receivers (Manceboand Chamberlain 2000; Rodríguez-Pérez et al. 2007), comparingdifferent kinds of positioning (Naesset and Joinmeister 2002;Gegout and Piedallu 2005), and comparing GPS and the GNSS(Naesset et al. 2000, 2001). Experiments indicate that the most ac-curate positioning is obtained in differential mode and by usingdual-frequency receivers, although, there exist problems with thelatter in resolving ambiguities because of cycle losses in the satel-lite signal (Hasegawa and Yoshimura 2007).

The results of these assessments cannot be compared becauseof the lack of standards, best practices, and systematic methods toassess GNSS measurements under forest canopy. To compare sim-ilar work, defining standards will have to address the assessmentof the whole receiver, including traceability, methodology, termi-nology, acquisition, preprocessing (Beraldin et al. 2007), and theequations to calculate accuracy and precision.

The aim of our research was to study the usefulness of machinelearning techniques for estimating GPS observation errors in for-estry environments and to evaluate the effects of forest cover onGPS positioning accuracy and precision.

Materials and Methods

Data Collection and Calculation of the Errors

The experimental points were located in Sancedo (a province ofLeón) in the northwest of Spain (Fig. 1). Observations were madeat 12 different points located in areas planted with Pinus radiataD. Don. Measurements were also made at a reference point located

in a clearing so that the coordinates for the 12 experimental pointscould be calculated.

The data were collected by using two dual-frequency GPS receiv-ers (Hiper Plus, Topcon Positioning Systems, Inc., Livermore, CA)that observed GPS pseudorange and carrier phase. The data forthe 12 experimental points were collected over a 4-day period(August 20-23, 2007). Measurements were made at each experi-mental point for at least 1.5 h, and the data were reviewed to ensurecontinuity. The data were reduced to obtain 12 data sets that rep-resented data collection over a period of 1 h. Antenna heightsranged from 1.45 to 1.60 m, and the logging rate was 1 s.

The position of the reference point was obtained at the sametime as the position of the experimental point for each of the12 experimental points. The coordinates for the reference pointwere 42°41'08.79872"N, 6°38'03.210587"W (latitude-longitude,WGS84), and ellipsoidal height was 933.829 m. These coordinateswere calculated by differential correction by using data from thePonferrada base station (the nearest reference station in the regionalGNSS network: http://gnss.itacyl.es/). The reference point was pro-jected by setting up the following (easting, northing) coordinates:(693,814.623, 4,728,635.531) m (Universal Transverse Mercator(UTM)-datum ETRS89; zone 29N). The reference point coordi-nates were used to calculate the coordinates assumed as true of theexperimental points by differential correction (the coordinates areshown in Table 1).

Next, the accuracies obtained were deterrnined for every secondobservation in each plot. The Joint Committee for Guides inMetrology (JCGM) defined accuracy as the closeness of agreementbetween a measured quantity value and a true quantity value ofa measurement (JCGM 2008). In this work, horizontal accuracyand vertical accuracy were calculated for each sample by using thefollowing equations:

/ / -) / // t ; '- <,

110 220! 1 I I !

440 km! I

= Road/Track.\

------ Contour line ) \-J -;./¡ Forest (P. radiata) \ ,

!ttL-".._.".- "'-=-~"..,(I i

Fig. 1. Location of the experimental points and the reference point in Sancedo (León, Spain)

JOURNAL OF SURVEYING ENGINEERING © Ases / NOVEMBER 2011/141

..-...

Table 1. Accurate Coordinates, Trueness, and Precision for the Experimental Points and the Reference Point (ETRS89, UTM29N)

Point Easting (X) Northing (Y) Ellipsoidal height Horizontal trueness Vertical trueness Horizontal precision Vertical precision

POI 696,104.03 4,729,765.54 1,013.13 1.83 6.80 0.77 2.40

P02 696,058.33 4,729,778.21 1,013.72 4.21 3.68 1.79 2.29P03 696,078.14 4,729,740.30 1,013.97 2.39 4.91 1.14 3.50P04 693,652.72 4,728,368.28 913.64 1.74 4.21 0.91 2.60P05 693,696.85 4,728,436.42 918.90 3.87 4.18 1.58 2.36P06 693,673.89 4,728,454.87 921.66 2.12 7.13 0.99 3.64P07 696,068.87 4,729,584.83 1,007.37 4.81 5.21 2.50 3.20P08 696,052.45 4,729,566.48 1,008.89 3.78 4.42 1.95 3.33P09 695,566.00 4,728,784.62 989.02 2.07 8.30 1.06 4.40

PlO 694,450.03 4,728,474.45 954.21 2.85 5.49 1.45 2.76P11 694,457.89 4,728,452.08 955.89 2.01 10.02 1.20 5.83P12 694,478.74 4,728,446.77 954.93 3.12 5.71 1.52 3.04RP 693,814.62 4,728,635.53 933.83 1.32 2.33 0.54 1.42

Hace = V(E¡ - Etrue? + (N¡ - Ntrue)2 (1)

Vaee = IZ¡ - Ztruel (2)

in which Hace and Vaee indicate horizontal and vertical accuracy,respectively; E¡, N¡, and Z¡ = positions measured at the ith second;and Etrue, Ntrue, and Ztrue = coordinates assumed as true in theeasting, northing, and ellipsoidal height directions, respectively.

The measurement trueness is defrned as the closeness of agree-ment between the average of an infinite number of replicatemeasured quantity values and a reference quantity value (JCGM2008), whereas the precision is the closeness of agreement betweenindications or measured quantity values obtained by replicatemeasurements on the same or similar objects under specified con-ditions (JCGM 2008). Precision calculations were made for differ-ent time intervals n measured in seconds (with n = 60, 120, 180,240, or 300 s). Calculating precision values simplifies the outputfunction, and this, theoretically, makes it easier to estimate thefunction by using machine learning. Table 1 shows the trueness(calculated by using mean values) and the precision (calculated byusing standard deviation values) for both the horizontal and thevertical measurements.

Variables Characterizing the Forestry Canopy

To characterize each tree mass, the trees in each plot in a radiusof 10 m around the experimental point were measured, and thefollowing dasometric variables were calculated: arithmetic meandiameter, mean height, dorninant height, treetop height, tree den-sity, basal area, quadratic mean diameter, Hart-Becking index,wood volume, and biomass.

The arithmetic mean diameter (dm) is deterrnined by using thefollowing equation:

dl',d

m==-Ln

(3)

in which d¡ = normal diameter (measured at a height of 1.3 m) foreach tree; and n = total number of trees in the plot.

Mean height ha corresponds to the arithmetic mean of the treeheights

ha = ¿h¡n

(4)

in which h¡ = tree height; and n = number of trees measured ineach plot.


Dorninant height (Ho) was calculated as the mean height of thefour largest trees in each plot.

Treetop height (hJ was deterrnined by calculating the differencebetween total height and height to the first branch.

Tree density (N) was calculated as the number of trees perhectare.

Basal area (G) reflects the cross-section area at the normalheight of the trees in the plot

~

1iG= -d4 L

The quadratic mean diameter (dg) measures tree diametercompared to mean basal area

dg

= f4GV-;;;The Hart-Becking index (HE!) is defined as the relationship

between mean spacing between trees (a) and dorninant height (Ho),and HE! is related to the number of trees per hectare (N)

HE! = ~ 100 = 1O~ 100HHo HoyN

To calculate wood volume (V) for individual trees, we used aspecific formula for Pinus radiata (Castedo-Dorado et al. 2009)

V = (0.000081d¡7746ho.97li)N

Biomass (W) was calculated by using models developed byBalboa et al. (2006), who distinguish between the followingcomponents: wood (Wm), bark (Wc)' large branches (Wrg), smallbranches (Wr¡), twigs (Wram), and needles (Wac)

W = (Wm + Wc + Wrg + Wmrf + Wram + WaJN

= (0.0123d)6042hI.4131 + 0.0036dl·26564

+ (1.37699 + 0.001065dfh) + 0.0363d¡-6091h-o.9417

+ 0.0078d¡1.9606 + 0.0423d¡7141)N (9)

Finally, the slendemess coefficient (C) expresses the relation-ship between mean height and mean diameter of the mass

C= hadm

(5)

(6)

(7)

(8)

Table 2. Dasometric Variables for Each Plot

Plot dm (m) h.; (m) Ho (m) he (m) N G (m2/ha) dg (m) HBl (%) V (m3/ha) W (kg/ha) ePOI 0.18 14.48 16.05 9.58 764 18.73 0.18 22.54 124.85 57,717 82.51P02 0.20 16.42 17.05 10.85 732 23.65 0.20 21.68 182.46 83,696 81.64P03 0.20 14.86 15.70 9.94 764 24.68 0.20 23.04 190.21 87,207 73.86P04 0.28 24.58 26.48 17.08 605 38.28 0.28 15.36 385.36 179,483 88.24P05 0.29 27.40 27.65 19.14 668 45.94 0.30 13.99 473.19 220,913 94.30P06 0.28 25.55 25.93 16.55 509 31.59 0.28 17.10 315.00 146,441 92.27P07 0.14 15.92 15.00 9.12 2,037 37.00 0.15 14.77 236.55 114,336 111.69P08 0.16 18.50 18.95 10.10 1,751 39.83 0.17 12.61 278.51 131,151 114.48

P09 0.15 15.15 15.28 9.81 1,814 32.49 0.15 15.37 191.13 91,907 103.58PlO 0.14 21.78 21.95 11.49 3,056 59.87 0.16 8.24 442.67 216,048 155.76PlI 0.13 20.55 22.05 11.79 2,960 53.27 0.15 8.34 379.54 188,170 153.78P12 0.12 22.48 22.28 12.06 2,992 46.22 0.14 8.21 312.87 157,546 182.16

Regression Models

To calculate GPS observation accuracy according to dasometricvariables and GPS variables and to evaluate the influence of each,we used three different machine learning techniques, each brieflydescribed subsequently. The models obtained by using these tech-niques are all nonlinear models. Other simpler models tested (suchas univariate and multivariate regression) produced unsatisfactoryresults because they were incapable of establishing relationshipsamong accuracy and the different explanatory variables. This isthe main reason why they are not mentioned in this section; in ad-dition, they are well-known methods that have been discussed insimilar research.

The descriptions of the machine learning techniques used inthis research are, out of necessity, brief. Readers interested in morein-depth explanations can look up the references provided.

Measurement points were selected to have groups with inter-nally similar dasometric variables that were considerably differentfrom the other groups. Hence, Points 1, 2, and 3 referred to areaswith substantially less forest cover and Points 10, 11, and 12referred to areas with substantially more forest cover (see Table 2),bearing in mind variables such as N, W, G, and HB!.

GPS Signal Variables

In addition, we recorded a series of variables for each measurementsecond that condition observation accuracy independently of treecanopy. These variables are listed as follows:• PDOPp: Position dilution of precision (PDOP) for each experi-

mental point under forest canopy;• PDOP,.: PDOP for the reference point;• ax : Error X for the reference point;• ay'-"': Error Y for the reference point;• ax~'cc: Error XY for the reference point;• az '-'~Error Z for the reference point;• E/"Mean elevation angle for the satellites transrmttmg the

signal for the experimental point under the forest canopy;• E: Mean elevation angle for the satellites transmitting the signal

received by the reference point;• DLLSNCAp: Indicator of the signal-noise ratio in coarse/

acquisition (CA) code (in dB • Hz) for the experimental pointunder the forest canopy;

• DLLSNCAr: Indicator of the signal-noise ratio in CA code(in dB • Hz) for the reference point;

• nCAp: Number of satellites receiving CA code for the experi-mental point under the forest canopy;

• nCAr: Number of satellites receiving CA code for the refer-ence point;

• DLLSNLlp: Indicator of the signal-noise ratio in P code for Ll(in dB • Hz) for the experimental point under the forest canopy;

• DLLSNLl,.: Indicator of the signal-noise ratio in P code for Ll(in dB • Hz) for the reference point;

• nLlp: Number of satellites receiving code in the Ll carrier forthe experimental point under the forest canopy;

• ul L: Number of satellites receiving code in the L1 carrier forthe reference point;

• DLLSNL2p: Indicator of the signal-noise ratio in P code for L2(in dB • Hz) for the experimental point under the forest canopy;

• DLLSNL2r: Indicator of the signal-noise ratio in P code for L2(in dB • Hz) for the reference point;

• nL2p: Number of sate1lites receiving code in the L2 carrier forthe experimental point under the forest canopy; and

• nL2r: Number of satellites receiving code in the L2 carrier forthe reference point.

Support Vector Regression

Support vector regression (SVR), which was proposed by VladimirVapnik and others to solve regression problems (Drucker et al.1997), is based on concepts developed previously for cJassificationproblems and gives rise to support vector machines (SVMs)(Vapnik 1995). Given a training set of 1 samples (x¡,z¡), withX¡ E R" and Z¡ E R, the goal is to find a functionf(x) with a maxi-mum of one deviation 10 with respect to the values Zi observed forthe entire training set that is also as minimally complex as possible.

If we initially consider a linear function

f(x)=(w,x)+b with w e R", (11)bER

in which (w, x) denotes internal product in R, the problem consistsof finding w and b that satisfy the previously mentioned goal.

In this case, the condition of minimum complexity impliesobtaining w with a minimal norm IIwl12 = (w, w). The conditionthat the deviation of the function with respect to Z¡ should be lessthan 10 V i, may not be feasible in some cases. For this reason, slackvariables ~, e are introduced to obtain an optimization problemthat is computationally solvable.

Regression with SVR can be posed as an optimization problemwith constraints (Smola and Scholkopf 2004):

1 1T(W,~, ~*) = 2:"wI2 + e¿)~+ ~*) (12)

i=l

subject to

JOURNAL OF SURVEYING ENGINEERING © xsce / NOVEMBER 2011/143

z· - (w x.) - b < e + c,,l '1 - ~l

(w, Xi) + b - Z¡ :$ e + ~i ~,e~O

(13)

(14)

The constant e > o determines the compromise between thecomplexity of J and model accuracy.

To generalize to a nonlinear regression, we used a method basedon transforming the original space to a larger dimension spacein such a way that linear solutions in the new space give rise tononlinear solutions in the original space. Defming a positive def-inite function k (kemel) such that the chosen transformation isrfJ: R" -+ K, and letting

rfJ(x) . rfJ(x') = LrfJ;(x)rfJ¡(x') = k(x, x') (15)

it is demonstrated that this function k is a well-defined scalarproduct (Boser et al. 1992).

Therefore, introducing the Lagrange multipliers a and a*, thefunction to be minimized can be expressed as

Table 3. Means and Standard Deviations for R2 Obtained for HorizontalAccuracy with Just One Explanatory Variable

Variable (fR2

SVR

R2

MLP

17R' R2

RBFaR2 R2

CE

HB!

V

HoNGhaW

dg

hedm

DLLSNCApDLLSNCAR

DLLSNLlpnCAr

»r.r,DLLSNL2pPDOPp

nL2pnr.r,ErO"XYr_acc

DLLSNL2r

DLLSNLlrayr_acc

PDOPr

nCApCJZr_xc

Ep

«c.:

0.75720.75240.75210.75100.75010.74830.74670.74530.74480.74460.58280.50570.40500.38080.37740.36940.35580.34260.33090.33090.32780.30660.29050.28620.28270.27040.26810.26410.24570.2304

0.13990.13850.13450.14820.12910.12780.15050.12770.12770.12740.24460.28600.29080.27840.22260.21850.29660.23080.25660.25660.27100.26070.22170.22000.26850.22600.25920.20680.21850.2181

0.67450.69190.68570.64780.67810.67900.66530.67120.53040.64500.51220.48110.27880.22370.28280.31820.20010.07400.14670.12040.27240.20320.20810.18710.15810.17030.09740.17160.21210.0992

0.22300.19460.20820.24240.22800.23860.24200.23640.29030.25680.30980.25910.23620.25680.18250.20870.24870.13400.22590.18030.20270.19260.20540.18980.16070.15340.12510.17860.23790.1419

0.57560.57560.57560.57560.57070.57560.57560.57560.57560.57560.43840.00380.18730.19290.20170.07340.13880.08150.10720.10720.17690.12330.15480.19180.10880.01120.03310.08580.11120.0570

0.26890.26890.26890.26890.26120.26890.26890.26890.26890.26890.29670.01690.22010.26300.19260.15510.22340.16560.20750.20750.17170.19190.20710.23520.15620.05000.08940.17980.18170.0930

l 1W(a,a*) = 2' (a - a*f Q(a - a*) + e L(O:¡ + o:j)

;=1

subject to1

L(O:; - 0:;) = O,;=1

in which

1

+ LZ¡(O:; - 0:;);=1

0:$ 0:;, 0:; s e,

Qij = K(x¡, Xj) == c.p(X¡)T . c.p(Xj)

(16)

i=l, ... ,l

(17)

(18)

The estirnated regression function at a given point is,therefore,

1

J(x) = L(O:; - o:¡)k(x¡,x) + b;=1

(19)

in which b is obtained from the fact that Eq. (13) is converted intoequality with ~i = O if 0:$ 0:; :$ e, and Eq. (14) is converted

Table 4. Means and Standard Deviations for R2 Obtained for VerticalAccuracy witb Just One Explanatory Variable

Variable R2

SVR

R2

MLP

17R' R2

RBF17R' (lR2

CE

HB!

V

HoN

G

haWdg

hedm

DLLSNCAp

DLLSNCAr

DLLSNLlpnCA,.nl.l ,DLLSNL2pPDOPp

nL2pnLlpErO"XYr_acc

DLLSNL2r

DLLSNLlrG"yr_acc

PDOPr

nCAp

aZr_acc

s,


«c.:

0.64100.64270.64110.64100.61120.64550.64330.64490.62150.64100.64130.28510.34310.32420.26720.26960.24180.19890.29050.29050.25890.33280.31550.19960.25360.28030.26730.35010.35010.3252

0.22230.22570.22240.22230.25010.18580.16510.22390.22520.22230.16290.27750.26730.31070.26880.26750.19580.20300.28360.28360.29360.28060.28260.21620.26850.30360.16810.25440.23910.2804

0.57950.62680.56640.63390.51370.60560.63510.56370.59470.63380.58880.08060.21300.20170.19220.23100.08810.13420.25250.20250.10160.12370.18640.13960.07800.03680.14500.14330.22020.1153

0.22920.20550.21810.19440.25840.22550.16370.27620.20660.20050.24150.08190.18860.19260.17070.20370.13600.14950.20290.20040.12040.15070.16000.15090.10460.11980.18040.19040.21880.1564

0.46820.46820.46820.46820.43510.46850.46820.46820.46820.46820.43940.06000.11190.12450.04030.15850.07440.05620.00800.00800.01640.13310.07660.07980.07970.07040.08160.16280.10900.1067

0.28500.2~500.28500.28500.26540.28500.28500.28500.28500.28500.28930.14850.14520.18230.11030.25540.08730.12960.03580.03570.07320.21710.13920.10840.13190.17190.14640.1878

0.16550.1534

0.795

0.79

0.785

0.78

0.775<=

0.77

0.765

0.76

0.755

0.75V

0.662

0.66

0.658

0.656

0.654D::0.652

0.65

0.648

0.646

0.644W

nCAp DLLSNCAp DLLSNCAr PDOPr

Fig. 2. Variation in R2 for horizontal accuracy (top) and verticalaccuracy (bottom) as explanatory variables are added to the model

6

5E~4I!!::1u .:;¡ 3 .~ 2N'Co::c 1

O5

R' = 0,0074

10 15 20Hart-Becklng Index

25

into equality with~; = OifO ::; a7 ::; C. This functionf(x) dependssolely on the training subsets for which ai, a7 =f. o. The elements ofthis subset are known as support vectors.

Multilayer Perceptron

The first modem neural model, proposed by McCulloch andPitts (1943), has served as the basis for many current models.The two architectures considered are multilayer perceptron(MLP) and the radial basis function network (RBFN), both ofwhich are based on a more general multilayer feedforward model(Haykin 1999).

MLP is a neura1 network formed by an input layer, anoutput layer, and one or more hidden 1ayers (Pinkus 1999). Eachneuron in layer j can be connected to all the neurons in layerj + 1 or to just a specific number of neurons. Its output is givenby the expression

(20)

Fig. 3. Scatter plot of the mean horizontal errors versus the Hart-Becking index for each of the 12 experimental points

0.74 ,---~~--~---~-----,

0.72

0.7

0.68

0.66

O:: 0.64

0.62

0.6

0.58

0.56

0.541 234

time interval (minutes)

in which x¡ = ¡th input; T = nonlinear activation function for R in R;and {Jo,(3¡, ... , {Jd = numeric parameters (the neuron weights).

The MLP output proposed in this case takes the following gen-eral form (activation function T is a sigmoid with d neurons in theinput layer, h neurons in the hidden layer, and one neuron in theoutput layer):

h

f(x) = LCjUj + Coj=l

(21 )

in which co, C¡ , ... , ch = weights associated with the neurons in thehidden layer; and Uj,j = 1,2, ... , h = outputs for each neuron in thislayer expressed in Eq. (20).

To train the network, we used the Bayesian regularization algo-rithm, which adjusts the weights and values of the constants accord-ing to Levenberg-Marquardt optimization (MacKay 1992; Foreseeand Hagan 1997). This minimizes the combination of quadratic er-rors and weights and determines the correct combination producedby a network that generalizes well.

Radial Basis Function Network

The RBFN differs from MLP is several ways (Bishop 1995). Firstof all, it generally has a fixed architecture composed of one input

0.6 r-----,---~~--~-----,

0.55

0.5

O:: 0.45

0.4

0.35

5 2ft'- 3 4time interval (minutes)

5

Fig. 4. Variation in R2 for the SVR model as the time interval increases: horizontal accuracy (left) and vertical accuracy (right)

JOURNAL OF SURVEYING ENGINEERING © xsce / NOVEMBER 2011 /145

Fig. 6. Means of the R2 values and the confidence intervals (95%) for the horizontal accuracy (left) and the vertical accuracy (right) for the referencepoint, for the GPS signal variables

O,8,,---.~~~~~~~~~~~~~~~---.--,

0,7

0,6

0,1

0,5

0,4

0,3

0,2

0,0

-0,1' , , ,,'~~~~Z~~mI->~U~~<~~~~~o Uu...J...J...J...J

o zc:zczc:-o- Mean c.. ~ ~ ~::r:: ±O,95 Cont. Interval i5 i5 i5

Fig. 5. Means of the R2 values and the confidence intervals (95%) forthe horizontal error, for each dasometric variable, and for each GPSsignal variable, for all the experimental points

layer, one output layer, and one hidden layer. Second, the neuronsin the hidden layer have a radial basis function (RBF) as the acti-vation function. RBFs also have the peculiarity that they only de-pend on the distance from the argument to a specific center. In thiscase, we chose a Gaussian function as giving the best results

hj(x) = e-lIx-cjI12/rJ (22)

in which x = input vector; hj = output of the jth neuron in the hiddenlayer; and Cj and rj = center and the width of the correspondingradial function, respectively,

The operation in the output layer is represented by theexpression

11

Yk(X) = Lwkjhj(x) + bkj=1

(23)

in which Yk = response of the kth neuron in the output layer;Wkj = weight corresponding to the connection between the kthneuron in the output layer and the jth neuron in the hidden layer;and bk = a constant termo

0,501 I

0,45 f·····' . ;,+1" 1.:;=T+ ..¡0,40~""'-r-=F---=i=-- --·~I--I----I---I---·, ..---~

::::: ~~~Intm-._:-=:0,25 f······ ..... , ····I±+ ·1

O,20~' ·¡ ·-:-;-···I···i- : _+ ~ =T- :- ....;_ _~

0,15 ~-l.-+--L..L--L.:__¡_--L~__¡_~. ! I

0101 II PDOPr DllSNCA, DLlSNll, DUSN12r -o- Mean

E, aCA, nU, ou, ::r:: ±O,95 Cont. Inlarval

Results

Each of the machine leaming techniques previously described wasused to make an estimate of the horizontal and vertical accuracy ofthe observations. In this work, the sample was composed of a totalof 43,200 data observations. The goodness of fit for each modelwas measured by using the coefficient of determination R2, whosevalues range between O and l. R2, which indicates the fraction ofvariance in a GPS error explained by the model under consideration(1 represents a perfect fit), is calculated as follows:

R2 = -'-(S_SY_-_S_SE....:...)SSY

(24)

in which

SSY?=¡ (y¡ - y)2 and SSE;:"'¡ (y¡ - y¡)2 (25)

in which Y¡ = ith value for the variable to be predicted; y = variablemeasurement; and y¡ = prediction of the model for y¡.

A cross-validation procedure was used for the SVR method todetermine the type of kernel giving the best results. This kernel,based on RBFs, is defined as

k(u, v) = exp( -y*lu - v12) (26)

in which y = a parameter that weights the difference between uand v.

The network was initially trained by using just one explanatoryvariable each time, with a view to determine which variable yieldedmost information on the error. The network parameters were deter-rnined in each case by using 20-fold cross-validation. Table 3shows the means and standard deviations for R2 obtained on pre-dicting horizontal accuracy with each parameter separately and byusing each ofthe methods. Table 4 shows the same data for error inthe Z coordinate. In both tables, it can be observed that the methodproducing the best results was SVR and that the dasometric vari-ables for each plot yielded the best fit, with R2 values of more than0.7, mostly for horizontal accuracy.

By using various input variable combinations, the best resultswere again obtained for SVR, with values that depended on thetime interval considered. The procedure used was the following:we started with the variable with the highest value for R2, added thevariable producing the best fit for two explanatory variables to themodel, and added other variables successively until there was nofurther increase in R2. For a 5-min time interval (i.e., considering

0,451' ':! 1

0,401- ;........•.... -i- ..,. .I .....•..... -rr- ..; ; ~

0,35 ~-+-_·I- ...;.--T-·J--I----4·---I--;..--l

:::~~~~·EtJ·~T~,;.i0,20 [",'·_ .. 1 .. ·····1···...[ ..... 1.. · .. 1··-1 .....·1·······1. . ....

0,15f--'-f---+._'-±: -1·-1-,*·-1·-4,.--"---

0,10' ¡ ;; ¡ ; ¡. ,

PDOP, DUSNCA, DUSNU, DlISNl2, -<>- MeanE, oCA, .u, IU, ::r:: :10,95 Conf. Inte••• 1


mean error every 5 min as the dependent variable), the optimalparameters were e = 1, e = 0.05, and y = 0.1 for the XY error(horizontal accuracy) and e = 1000, e = 0.001, and y = 2 forthe Z error (vertical accuracy).

Fig. 2 shows the contribution of each variable to the total pre-cision of the model. It can be observed that the dasometric values

best explained most of the error. The contribution of the remainingparameters and variables was quite small. This was confirmed byapplying the same techniques to the errors at the reference point tolocate a model that related these errors with the GPS variables (theforest canopy variables, obviously, had no bearing on referencepoint error).

8 152 3 4 5 6 7 8 9 10 11 12 2 3 4 5 6 7 8 : 9 10: 11 12

7

6

10»5 ~o~ 4 ~:J :Jo o~ 3 ~

5.

O12 24 36 48 60 72 84 96 108 120 132 12 24 36 48 60 72 84 96 108 120 132

Intervals lritervalsHorizontal accuracy (SVR model) Vertical accuracy (SVR model)

2 3 4 5 6 7 8 9 10: 11 128

2 3 4 5 6 7 8 9 11 127

6

Fig. 7. Horizontal and vertical accuracies for the 12 experimental points and the reference point; the values for the 12 experimental points arerepresented consecutively;the solid line is assigned to the calculatedaccuracy and the dashed line to the estimated one, for each of the three regressionmodels; the calculated accuracy for the reference point is shown with a dotted line

10

12 24 36 48 60 72 84 96 108 120 132

IntervalsHorizontal accuracy (MLP model)

82 3 4 5 6 7 8 9 11 12

7

6

5»o~ 4:J()s 3

» 5oe:J 48« 3

O12 24 36 48 60 72 84 96 108 120 132

IntervalsVertical accuracy (MLP model)

152 3 4 5 6 7 8 9 10 11 12

JOURNAL OF SURVEYING ENGINEERING © ASeE / NOVEMBER 2011 / 147

0~-12~-2~4--36~-48~-60~-7~2--484~~96~~108~~12O~~132~--~

IntervalsHorizontal accuracy (RBF model)

10

12 24 36 48 60 72 84 96 108 120 132

IntervalsVertical accuracy (RBF model)

Table 5. Maximum R2 Obtained for Accuracy for Each of the 12 Pointsunder the Forest Canopy by Technique Used

SVR MLP RBF

Hace

v.,0.78690.6584

0.61090.5646

0.73930.6422

This error dependence on the forest canopy was not detected bysimpler models, such as linear regression or polynomial models.Fig. 3 shows a mean value for horizontal accuracy for each exper-imental point with reference to the HB! variable. There exists noobservable direct relationship between them. The same conc1usioncan be drawn regarding the remaining forest cover and GPS signalparameters and variables.

It can be concluded from the results that the influence on accu-. racy of the parameters associated directly with the GPS signalis low.

Error variability is important when trying to obtain a good fit.The fit improved considerably with longer time intervals becausethere exist fewer fluctuations in the output variable (Fig. 4).

To determine whether significant differences exist for the valuesfor R2 obtained by using 20-fold cross-validation, means were com-pared for each of the variables by using the t-test, with the resultsshowing that for the dasometric variables (which pro vide mostinformation on the model), only d.; differed from the rest. Fig. 5shows the R2 means and confidence intervals for horizontal error,for the dasometric variables and the GPS signal variables, and foreach experimental point under the forest canopy. It confirms thet-test, and the R2 obtained for d.; is significantly lower than theother variables. Therefore, with the exception of d"" the dasometricvariables explain the accuracy of the measures equally.

None of the analyzed dasometric variables directly measure fo-liage density, and Sigrist et al. (1999) concluded that, although theexistence of the canopy overhead may degrade positional precision,it is the foliage that plays the greatest role in signal reception. Forthe same kind of analysis made for accuracy at the reference point,it was concluded that there were no significant differences betweenthe different GPS signal variables (the dasometric variables do not,obviously, affect accuracy at this point). The results ofthe t-test areshown in Fig. 6. The best fit was for DLLSNCAr, in whichRz = 0.34, so the signal variables were insufficient to determineerror at this point. To corroborate this result, larger data sets areneeded; however, Yeh et al. (2010), who studied the quality andprecision of GPS positioning in reference stations, found no signifi-cant clock offset and multipath effects.

In Fig. 7, which shows the observed (solid line) and estimated(dashed line) precision values for 5-min time intervals, jumpscorresponding to the 12 experimental points are evident, althoughthey have been marked by using vertical dotted lines to facilitatetheir recognition. The dotted line corresponds to the precision ofthe reference point. As expected, the error for the reference point(dotted line) is considerably smaller than on points under the forestcanopy; therefore, as the statistic analysis has confmned, most ofthe error is because of the effect of vegetation on the GPS signal.The corresponding values for R2 are shown in Table 5. SVR was themodel with the best fit, although it was similar to the fit for the MLPnetwork.

Conclusions

We evaluated the influence of dasometric variables and the varia-bles associated with the GPS signal for the accuracy of observations


made with a GPS receiver under the forest canopy. The resultsobtained for the three machine learning techniques used pointedto the same conc1usion: accuracy and precision depend greatlyon the presence of forest cover and far less on GPS signal variables,such as PDOP, the number of satellites, or the signal-noise ratio.

A comparison of the coefficients of determination for eachdasometric variable that was analyzed revealed no significantdifferences between the variability explained by each variable, withthe sole exception of the mean diameter, for which lower coeffi-cients of determination were obtained. Therefore, any of these var-iables can be used to construct models to estimate error, although itshould be noted that other factors that were not inc1uded in themodel could affect accuracy.

The great variability in accuracy over time makes it complicatedto locate mathematical mode1s to estimate measurement error forthe kind of variables that were analyzed here, inc1uding complexnonlinear models, such as neural networks. This may explainwhy, in previous research using linear models, accuracy andprecision of the measurement interval are associated with a smallnumber of explanatory variables, with no information providedabout the fit, merely [through an analysis of variance (ANOVA)]on the significance of the variables. The results improved whenmean values for the time intervals of more than 1 s were usedbecause this reduced the complexity of the variable that was es ti-mated. The better fit was obtained by calculating precision valuesfor 5-min intervals (the longest intervals studied). The best fit forhorizontal accuracy was 80%; and for vertical accuracy, the best fitwas only 66%.

References

Bakula, M., Oszczak, S., and Pelc-Mieczkowska, R. (2009). "Performanceof RTK positioning in forest conditions: Case study." J. Surv. Eng.,135(3), 125-130.

Balboa, M., Rodríguez-Soalleiro, R., Merino, A., and Álvarez-González,J. G. (2006). "Temporal variations and distribution of carbon stocksin aboveground biomass of radiata pine and maritime pine pure standsunder different silvicultural alternatives." For. Ecol. Manage.,237(1-3), 29-38.

Beraldin, J. A., Blais, F., El-Hakim, S. F., Cournoyer, L., and Picard, M.(2007). "Traceable 3D imaging metrology: Evaluation of 3D digitizingtechniques in a dedicated metrology laboratory." Proc., 8th Con! onOptical 3-D Measurement Techniques, NRC Institute for InformationTechnology; National Research Council Canada, Ottawa, ON, Canada.

Bishop, C. M. (1995). Neural networks and pattem recognition, OxfordUniv., New York.

Boser, B., Guyon, L, and Vapnik, V. (1992). "A training algorithm foroptimal margin classifiers." Prac., 5th Annual ACM Workshop on Com-putational Leaming Theory, Association for Computing Machinery(ACM), New York, 144-152.

Castedo-Dorado, F., Crecente-Campo, F., Álvarez-Álvarez, P., andBarrio-Anta, M. (2009). "Dcvelopment of a stand density managementdiagram for radiata pine stands including assessment of stand stability."Forestry, 82(1), 1-16.

Drucker, H., Burges, J. C., Kaufman, L., Smola, A., and Vapnik, V. (1997)."Support vector regression rnachines." Proc., 1996 Con! NIPS: Advan-ces in Neural lnformation Processing Systems 9, MIT, Cambridge, MA,155-161.

Foresee, F. D., and Hagan, M. (1997). "Gauss-Newton approximation toBayesian leaming." Proc., Int. Joint Con! on Neural Networks, IEEE,New York, 1930-1935.

Gegout, J. C., and Piedallu, C. (2005). "Effects of forest environmentand survey protocol on GPS accuracy." Photogramm. Eng. RemoteSens., 71(9), 1071-1078.

Hasegawa, H., and Yoshimura, T. (2003). "Application of dual-frequencyGPS receivers to static surveying under tree canopies." J. For. Res.,8(2), 103-110.

.4...

Hasegawa, H., and Yoshimura, T. (2007). "Estirnation of GPS positionalaccuracy under different forest conditions using signal interruptionprobability." J. For. Res., 12(1), 1-7.

Haykin, S. (1999). Neural networks. A comprehensive foundation, 2nd Ed.,Prentice Hall, Upper Saddle River, NJ.

Joint Committee for Guides in Metrology (JCGM). (2008). Internationalvocabulary of metrology (VIM)-Basic and general concepts andassociated terms, 3rd Ed., (http://www.bipm.org/utils/common/documents/jcgmlJCGM_200_2008.pdf) (Sep. 5, 2010).

MacKay, D. J. C. (1992). "Bayesian interpolation." Neural Comput., 4(3),415-447.

Mancebo, S., and Chamberlain, K. (2000). "Performance testing of theGarmin eTrex, Garmin GPSIII plus, Magellan GPS 2000XL, andMagellan Blazer recreation type global positioning system receivers."(http://www.fs.fed.us/database/gps/mtdc/gps2000/Nav_3-2001.htm)(Feb. 5, 2010).

Martin, A. A., Owende, P. M., and Ward, S. M. (2001). "The effects ofperipheral canopy on DGPS performance on forest roads." Int. J.For. Eng., 12(1),71-79.

McDonald, T. P., and Carter, E. A. (2002). "Using the global positioningsystem to map disturbance patterns of forest harvesting machinery."Can. J. For. Res., 32(2), 310-319.

McDonald, T. P., and Fulton, 1. P. (2005). "Automated time study ofskidders using global positioning system data." Computo Electron.Agric., 48(1), 19-37.

McCulloch, W. S., and Pitts, W. (1943). "A logical calculus of the ideasimmanent in nervous activity." Bull. Math. Biophys., 5, 115-133.

Murphy, G., Wilson, I., and Barr, B. (2006). "Developing methods forpre-harvest inventories which use a harvester as the sampling too!."Aust. For., 69(1), 9-15.

Nesset, E. (2001). "Effects of differential single- and dual-frequency GPSand GLONASS observations on point accuracy under forest canopies."Photogramm. Eng. Remote Sens., 67(9), 1021-1026.

Nresset, E., Bjerke, T., 0vstedal, O., and Ryan, L. H. (2000). "Contribu-tions of differential GPS and GLONASS observations to point accuracyunder forest canopies." Photogramm. Eng. Remote Sens., 66(4),403-407.

Ntesset, E., and Jonmeister, T. (2002). "Assessing point accuracy ofDGPS under forest canopy before data acquisition, in the field and afterpostprocessing." Scand. J. For. Res., 17(4),351-358.

Pinkus, A. (2008). "Approximation theory of the MLP model in neuralnetworks." Acta Numer., 8, 143-195.

Rodríguez-Pércz, J. R., Álvarez, M. F., and Sanz-Ablanedo, E. (2007)."Assessment of low-cost GPS receiver accuracy and precision in forestenvironments." J. Surv. Eng., 133(4), 159-167.

Sawaguchi, I., Nishida, K., Shishiuchi, M., and Tatsukawa, S. (2003)."Positioning precision and sampling number of DGPS under forestcanopies." 1. For. Res., 8(2), 133-137.

Sawaguchi, I., Saitoh, Y., and Tatsukawa, S. (2005). "A study of the effectsof stems and canopies on the signal to noise ratio of GPS signals."J. For. Res., 10(5), 395-401.

Sigrist, P., Coppin, P., and Hermy, M. (1999). "Impact of forest canopyon quality and accuracy of GPS measurements." Int. J. Remote Sens.,20(18), 3595-3610.

Smola, A. J., and Scholkopf, B. (2004). "A tutorial on support vectorregression." Stat. Comput., 14(3), 199-222.

Tachiki, Y., Yoshimura, T., Hasegawa, H., Mita, T., Sakai, T., andNakamura, F. (2005). "Effects of polyline simplification of dynamicGPS data under forest canopy on area and perimeter estimations."J. For. Res., 10(4), 19-427.

Taylor, S. E., McDonald, P., Fulton, J., Shaw, 1., Corley, F. w., andBrodbeck, C. J. (2006). "Precisión forestry in the southeast U.S." Proc.,1st 1nt. Precision Forestry Symp., University of Washington, Seattle,397-414.

Vapnik, V. (1995). The nature of statistical learning theory, Springer-Verlag, New York.

Yeh, T. K., Chen, y. J., Chung, y. D., Feng, C. w., and Xu, G. (2010)."Clarifying the relationship between quality of global positioningsystem data and precision of positioning." J. Surv. Eng., 136(1),41-45.

Yoshimura, T., and Hasegawa, H. (2003). "Comparing the precision andaccuracy of GPS positioning in forested areas." J. For. Res., 8(3),147-152.

JOURNAL OF SURVEYING ENGINEERING © ASeE / NOVEMBER 2011 /149

technical papers - grupos.unileon.esgrupos.unileon.es/ingecart/files/2013/05/jse_2011a.pdf · john...

Documents