rights / license: research collection in copyright - …7719/eth-7719-01.pdf · rtile - adaptive...

9
Research Collection Conference Paper RTILE - Adaptive rover navigation based on online terrain learning Author(s): Krebs, Ambroise; Pradalier, Cédric; Siegwart, Roland Publication Date: 2010 Permanent Link: https://doi.org/10.3929/ethz-a-010027777 Rights / License: In Copyright - Non-Commercial Use Permitted This page was generated automatically upon download from the ETH Zurich Research Collection . For more information please consult the Terms of use . ETH Library

Upload: phungtuong

Post on 26-Aug-2018

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Rights / License: Research Collection In Copyright - …7719/eth-7719-01.pdf · RTILE - Adaptive rover navigation based on online terrain learning A. Krebs*, C. Pradalier**, R. Siegwart***

Research Collection

Conference Paper

RTILE - Adaptive rover navigation based on online terrainlearning

Author(s): Krebs, Ambroise; Pradalier, Cédric; Siegwart, Roland

Publication Date: 2010

Permanent Link: https://doi.org/10.3929/ethz-a-010027777

Rights / License: In Copyright - Non-Commercial Use Permitted

This page was generated automatically upon download from the ETH Zurich Research Collection. For moreinformation please consult the Terms of use.

ETH Library

Page 2: Rights / License: Research Collection In Copyright - …7719/eth-7719-01.pdf · RTILE - Adaptive rover navigation based on online terrain learning A. Krebs*, C. Pradalier**, R. Siegwart***

RTILE - Adaptive rover navigation based on online terrain learning

A. Krebs*, C. Pradalier**, R. Siegwart***

Autonomous Systems Lab, ETHZ, Switzerlande-mail: * [email protected]

** [email protected]*** [email protected]

Abstract

In the scope of this work, a framework, called RoverTerrain Interaction Learned from Experiments (RTILE),was developed to handle the uncertainties of the rover en-vironment. This framework allows the rover to learn aBayesian model associating the rover-terrain interaction(RTI) characteristics and the observation of the terrain ata distance. The same model can be used to predict theRTI characteristics and then, optimize the rover path ac-cordingly. This paper focuses on this optimizing process.A direct trajectory to reach a goal can be challenged aslong as it shows improvements in terms of rover-terraininteraction. This process has to take into account that thelearning mechanism is unsupervised and therefore mustbe dynamically updated. The CRAB rover is used in thiswork for the implementation and testing of the approach.

1 Introduction

In order to move a valuable payload safely and reli-ably, a rover has to interact with its environment, whichis by nature an unknown actor. Hence it seems interest-ing to design a framework allowing a rover to learn fromits experience while it operates in a mission, the final goalbeing to use the accumulated knowledge to optimize therover path. As the terrain is, at least partially, unknownand is extremely difficult to characterize beforehand, therover has to rely on unsupervised technics to build its ownrepresentation of its interaction with the terrain.

1.1 State of the artSeveral research works have been dedicated to allow-

ing a rover to learn from its interaction with the terrain.[1] proposes to learn the slippage model of the rover withrespect to the terrain type and geometrical characteristics.In [2] it is argued that an autonomous system, in additionto learning from training data, should be able to detectand classify new terrains. The authors propose a Gaussianmixture model for detection and classification of novel ter-rains. [3] proposes a classification method making use ofboth visual and vibration data, comparing several classi-fication methods. It shows great results at classifying ter-

rains encountered but fundamentally the process is a su-pervised one. Among those approaches, few attempt toanticipate the rover-terrain interaction. In [1], the terrainappearance is divided into well known classes providinga prior set of trained classifiers. The number and type ofthe terrain is then fixed and the slippage model is thenlearned for each one of those. In the case of [4], it is theother way around as the visual characteristics of the terrainare learned online while the rover-terrain characteristicsclassifiers are trained before hand. One way or another,those approaches rely on trained classifiers which impliesa given and fix number of classes.

1.2 ObjectivesTo improve this interaction, we developed a frame-

work called RTILE (Rover-Terrain Interaction Learnedfrom Experiments) that provides the rover with the capa-bility to link remote and local data. Sensors such as a cam-era, acquire data of the environment at a distance and thisremote data is used to build the remote terrain perception(RTP) model. Local data expresses the rover-terrain inter-action (RTI) model which can be related to proprioceptivesensors such as an IMU. Those models are learned on-line and this process occurs automatically, and is unsuper-vised in the sense that it requires no preliminary training,nor any inputs regarding the number of terrain classes tolearn. On the other hand, the model can be used to predictthe RTI characteristics of the terrain ahead of the rover.The rover path can then be optimized based on the char-acteristic rating the RTI, called RTI metric, MRT I . Thispaper focuses on the development subsequent to [6].

1.3 ContentThe next section gives an overview of the RTILE

framework, allowing a rover to learn an RTI and RTPmodel to optimize its path. Section 3 describes in moredetails the process of using the predicted RTI metric tooptimize the rover path. The following section presentsthe rover as well as the test environment for the experi-ments described in Section 5. The tests described therehighlights clearly the impact of the approach. The paperconcludes with a summary of the present research and itspossible future.

Page 3: Rights / License: Research Collection In Copyright - …7719/eth-7719-01.pdf · RTILE - Adaptive rover navigation based on online terrain learning A. Krebs*, C. Pradalier**, R. Siegwart***

2 Approach Overview

The generic concept of RTILE is presented in Fig. 1.The near to far part performs the local and remote dataassociations using a grid-based approach. A Bayesianmodel [7] is used to handle the uncertainties and its var-ious probability distributions are asserted in the learningpart. The prediction makes use of the knowledge acquiredto estimate the RTI ahead of the rover. Finally the pathplanning uses the predicted RTI to influence the rovertrajectory. Note also that here, the focus is not on the

SOFTWARE

HARDWARE Local Sensors Remote SensorsActuators

Path Planning

ProBT

Prediction

Database

DelayNear to far

Learning

Controller

Figure 1. RTILE schematic.

traversability (where the robot is able to drive or not) buton a complementary aspect, namely to differentiate the ar-eas traversable with respect to their terrainability (abilityto negotiate terrain irregularities) [8].

2.1 Probabilistic ModelThe idea is to express the connection between the RTI

and the RTP characteristics. The RTI characteristics, com-puted with the data from the local sensors, are named Fl.The RTP ones are named Fr. Note also that Fl and Fr

are not expressed directly with respect to each other butfirst clustered into classes, Kl and Kr respectively. TheBayesian model and its decomposed joint distribution isexpressed as follow:

P(Fr, Fl,Kr,Kl) = P(Kr)P(Kl | Kr)P(Fl | Kl)P(Fr | Kr).

Finally, what interests us is to answer the question:What are the predicted Fl, based on the observed Fr?Based on the Bayesian model, the following answer canbe given:

P(Fl | Fr) � P(Fl | Kr = kr), (1)

with the most likely remote class kr defined as:

kr = argmaxKr (P(Kr | Fr)) (2)

Eq. 2 corresponds to assuming the classes are well sepa-rated in the features space, meaning that their probabilitydistributions have to be peaked.

The various distributions of the joint decompositionare learned according to the rover experiment. Thus, thedistributions of F, the variable associated with the fea-tures, with respect to K, associated to the class number,

are based on Gaussian model. Furthermore as a grid basedapproach is used, the cell containing both RTI and RTPdata provide knowledge about how probable the connec-tions between the feature spaces are. It allows calculatingand updating P(Kl | Kr). Finally eq. 1 allows us to deter-mine the most probable predicted local features Fl basedon the remote features Fr.

2.2 Path PlanningE* [10], a grid-based path planner is used to drive

the rover and reach the goal. The rover and goal posi-tions being known, a navigation function is computed foreach cell, stating the path’s cost to reach the goal fromthis cell. The underlying technique is expressed withinthe continuous domain, which corresponds to a wavefrontpropagating from the goal toward the rover. The path toreach the goal can be found by using a gradient descentover the navigation function. The E* grid consists of cellsce and r(ce) is the difficulty or cost of this cell. This pa-rameter corresponds intuitively to the wavefront speed ofthe navigation function. In our work the propagation costis computed not only based on traversability, T , but inte-grates the predicted RTI. Thus, we have:

r(ce)−1 = f (Λ,T ). (3)

Λ depends on the evaluation of the RTI, called MRT I , com-puted based on the local features. T depends mainly onthe geometry of the terrain and can be considered as aBoolean value.

Λ = h(MRT I) with Λ ∈ [0; 1] (4)

T =

{1 if the cell is traversable0 otherwise (5)

Λ takes a high value for a terrain having a good rover in-teraction. For example, assuming that the rover faces twoterrains (named white and gray), as depicted in Fig. 2. Therover, which position is marked by the cross, has to reachthe goal on the right hand side. If the rover is able toidentify that Λwhite > Λgray, then the resulting generatedtrace (dashed) provided by E* naturally avoids the grayterrain and is a result of a gradient descent performed onthe navigation function (illustrated by the wavefront). Insummary, E* offers a tradeoff between the movement costand the path length and it provides a trace to be followedto reach the goal.

Figure 2. Wavefront propagation with E* [11].

Page 4: Rights / License: Research Collection In Copyright - …7719/eth-7719-01.pdf · RTILE - Adaptive rover navigation based on online terrain learning A. Krebs*, C. Pradalier**, R. Siegwart***

(a) Image acquired. (b) MRT I prediction.

(c) Propagation costs.

Figure 3. RTILE prediction steps.

3 Adaptive Control

The approach described so far allows predicting theRTI characteristics as shown in Figure 3(a) and 3(b). Thissection focuses on the process to obtain the propagationcosts as depicted in Figure 3(c).

3.1 Propagation CostThe idea is to be able to handle globally the knowl-

edge acquired by the rover and to have a function Nconnecting the propagation costs, r(ce) and the predictedMRT I . In fact, a default, straight trajectory to reach a way-point can be challenged by an infinite number of otherpossible trajectories offering different RTI metric. There-fore, the function N has to convey the idea of tradeoff

between what can be gained in terms of MRT I , meaningδMRT I , and the deviation it imposes from the default tra-jectory. It can be rephrased as:“given a waypoint distant of l m, the rover should be al-lowed a deviation of d m from a direct trajectory if Ter-rain 2, corresponding to M2

RT I , is an alternative to Ter-rain 1 with M1

RT I , and improves the RTI characteristics byδM1,2

RT I”.For each new Fr acquired, the predicted propagation

cost of a ce is computed as follows:

rp(ce) = N(δMRT I), (6)

and the E* grid is updated according to a very conservativerule:

r(ce)+ = min(r(ce), rp(ce)), (7)

where MRT I is the predicted RTI metric of ce and r(ce)+

its updated propagation cost.

Note that in the following subsections, we focus onthe terrainability, Λ, and the terrain is assumed to betraversable. Therefore, according to eq. 3:

T = 1⇒ r(ce)−1 = Λ, (8)

and to simplify the writing we use Λ in the next parts ofthe text to express the propagation costs.

3.2 Cost FunctionThe first step is to establish the relation between

δMRT I and Λ, in conformity to the requirement that fol-lows. This relation has to be true whichever MRT I is as-serted, and consistently with the overall RTI model, or thenumber of terrains “known”. Let us consider three ter-rains, respectively named A, B and C. Knowing that theirRTI metric is as follows:

MART I < MB

RT I < MCRT I , (9)

their difference can be computed as:

δMA,CRT I = MC

RT I − MART I = δMA,B

RT I + δMB,CRT I . (10)

The E* algorithm, using the LSM1 kernel, has the propa-gation costs of its cell defined as:

r(ce)−1 = Λ ∈ [0, 1], (11)

where a value of 1 represents a cell with the fastest wave-front propagation (and the lowest propagation cost), andtherefore the best RTI metric. Thus, in order to use thefull scale available (eq. 11), we assign the terrain showingthe best MRT I a Λ value of 1. Assuming the RTI metric israted according to “the smaller the better”, the terrain ofreference is A, which gives us:

Λre f ,A = 1,

Λre f ,B = N(δMre f ,BRT I )−1,

Λre f ,C = N(δMre f ,CRT I )−1 = N(δMre f ,B

RT I + δMB,CRT I)

−1.

(12)

In order to have the different terrains expressed con-sistently within E*, their propagation costs need to be ex-pressed proportionally. Thus one can express:

Λre f ,C = Λre f ,A ·ΛA,B ·ΛB,C = N(δMre f ,BRT I +δMB,C

RT I)−1, (13)

and the equations above are solved by defining the propa-gation costs as:

Λre f ,t = C−δMre f ,t

RT ICα

β = R(Mre fRT I ,M

tRT I) = N(δMre f ,t

RT I )−1. (14)

R is a function expressing the relative propagation cost ofa terrain t with respect to a reference terrain re f . Theparameters Cα and Cβ still need to be defined and theyallow us to control the tradeoff between the deviation fromthe default trajectory and the resulting improvement withrespect to the RTI metric.

1This stands for Level Set Method and express the type of interpola-tion method used within E*. LSM is the best E* kernel [11].

Page 5: Rights / License: Research Collection In Copyright - …7719/eth-7719-01.pdf · RTILE - Adaptive rover navigation based on online terrain learning A. Krebs*, C. Pradalier**, R. Siegwart***

3.3 “Unknown” Propagation CostThe computation of the propagation cost is almost

complete, except for the cost of the areas that are notyet observed, or classified as “unknown”. As the corre-sponding MRT I is unknown, a rule must be defined thatwill greatly impact the rover behavior. Thus a very con-servative rule could define the RTI characteristic as badas the worst terrain encountered so far (worst case) andthe opposite rule, considering the “unknown” terrains aspreferable as the best known so far, would result in a veryadventurous. In our case, the propagation cost of the “un-known” terrain is rated based on the mean value of thebest and worst terrain known.

4 Setup

The CRAB rover is a robot with six motorized wheelsand a passive suspension system. The suspension systemis composed of two symmetrical structures such as de-picted in Figure 4. Also note that the CRAB rover concept

Figure 4. CRAB suspension mechanism.

is described with details in [12]. The following sensors areprimarily used in the context of this paper:

IMU An IMU, MT-9B from Xsens, is mounted on thebody of the CRAB. It provides the absolute orien-tation (Euler angles, ψi with i being x,y or z) of thechassis.

Monocular Camera An HD webcam from Logitech(Quickcam Pro 9000) provides two megapixel im-ages of what lies ahead of the robot within a 53 o fieldof view.

4.1 Feature Spaces and MetricsTwo feature spaces are used. The first one corre-

sponds to the RTI characteristics and uses the data pro-vided by the IMU. It represents the vibrations of the roverchassis, corresponding to the soil Softness:

F so f tnessl = (‖ψx‖, ‖ψy‖), (15)

with ψi, corresponds to the rover chassis angular accelera-tion along axis i. Note also that as it is a series of data, themean value is simply computed over the sequence corre-sponding to the grid cell treated.

The second feature space corresponds to the RTPcharacteristics and uses the camera data. It is a color de-scription which corresponds to the soil Appearance.

Fappearancer = (∆RG,∆GB), (16)

with

∆RG =R −G

v, (17)

∆GB =G − B

v, (18)

v = max(R,G, B). (19)

The divisor is named v as it is exactly how the V value isdefined in the HSV color space.

As the whole idea of optimizing the path of the roverrequires to determine the criteria driving this optimization,and therefore, choosing the function rating the RTI, MRT I .According to the test environment and in order to show theimpact of the overall method, we chose to minimize theamount of vibration within the structure, thus defining:

MRT I =

√F so f tness

l,12

+ F so f tnessl,2

2. (20)

Also note that to express what happened during thetest, it is necessary to define metrics summarizing therover behavior. Thus, two metrics are defined here, onecorresponding to the distance traveled and the other cor-responding to the softness of the terrain:

MDist =∑ √

∆x2 + ∆y2, (21)

MS o f t =

√‖ψx‖

2+ ‖ψy‖

2. (22)

4.2 Environment

Figure 5. Test environment.

The test environment for the experiments is depictedon Figure 5. Three terrains are primarily used:

Asphalt Normal street pavement, the hardest terrain.

Page 6: Rights / License: Research Collection In Copyright - …7719/eth-7719-01.pdf · RTILE - Adaptive rover navigation based on online terrain learning A. Krebs*, C. Pradalier**, R. Siegwart***

(a) Test setup. (b) E* propagation costs.

Figure 6. Part b: Starting on grass.

Grass Football field, the softest terrain of all three.

Tartan Athletic running track with a red appearance.

Thus we have:

MGrassRT I < MTartan

RT I < MAsphaltRT I (23)

with the MRT I rated according to “the smaller the better”.

4.3 ParametersIn the test presented in the next section, consistent and

constant parameters where used. Thus, the camera hori-zon, or maximum distance where the RTP feature are ac-quired, is 5 m. The rover speed is set to 6 cm · s−1, eventhough this speed is reduced when the rover turns for ex-ample. When learning activity is required, it is triggeredby the distance traveled and occurs every 6 m. Finally, theconstants Cα and Cβ, corresponding to the rover aggres-siveness are respectively set to 0.4 and 4. It correspondsto a prior of the user knowing the δMRT I between grassand asphalt and wanting the rover to reach the optimalterrain if it is observed.

5 Experiments

This section is dedicated to presenting the resultshighlighting the impact of the whole RTILE approach.Two scenarios are primarily presented in the following.

5.1 Test: Facing an “Unknown” TerrainThe test is divided in three parts. The setups of part b

and c are depicted on Figure 6(a) and Figure 7(a).In the first part, named a, the rover is asked to reach a

first waypoint 15 m ahead, and then another 30 m ahead ofthe starting position. Performing a straight, direct trajec-tory, the rover moves first on asphalt, and then on grass.This allows the rover to learn a reliable description of bothterrains. This knowledge is then used as a prior for partsb and c, where no more learning is performed.

In part b, the rover aims for a goal positioned 12 mmeters ahead and a straight trajectory would drive the

(a) Test setup. (b) E* propagation costs.

Figure 7. Part c: Starting on asphalt.

rover on grass for half of the trajectory, and then on tar-tan. However, the terrain description learned in part a isused as a prior and the idea is to start from the best knownterrain and to head toward an unknown terrain, the tartan.The test is also repeated five times to have more reliableresults.

In the last part, c, the rover heads toward the samegoal as in part b. This time, a straight trajectory drivesthe rover on asphalt for half of the trajectory, and then ontartan. Again, the terrain description learned in part a isused as a prior. Accordingly, the rover starts from theworst known terrain and heads toward an unknown area.Similarly, the test is repeated five times.

0 2 4 6 8 10 12 14-1

0

1

2

3Rover trajectories

x [m]

y [m

]

RTILE

Default

(a) Part b – “unknown” area avoided

0 2 4 6 8 10 12 14-1

0

1

2

3

x [m]

y [m

]

RTILE

Default

(b) Part c – “unknown” area looked for

Figure 8. RTILE and default trajectories.

5.1.1 ResultPart a While moving on asphalt and then on grass, therover has the opportunity to integrate those terrains both toits RTI and RTP models. The RTI metric of both terrainsare the following:

MAsphaltRT I = 0.67, (24)

MGrassRT I = 0.14. (25)

This corresponds to the prior used for part b and c.

Page 7: Rights / License: Research Collection In Copyright - …7719/eth-7719-01.pdf · RTILE - Adaptive rover navigation based on online terrain learning A. Krebs*, C. Pradalier**, R. Siegwart***

Part b The rover starts on grass and heads toward tar-tan, which is unknown at this point. The knowledge ac-quired in part a leads to:

MGrassRT I = Mre f

RT I , (26)

MunknownRT I = 0.4. (27)

The resulting trajectories can be observed in Figure 8(a).It can be seen that the rover tries to remain as long as pos-sible on grass in order to avoid the “unknown” area. Thenavigation costs of the whole environment correspondingto the knowledge accumulated during the test can be seenin Figure 6(b). The lighter area corresponds to the betterterrain, grass. Its upper border shows the limit betweengrass and tartan, whereas the lower one corresponds tothe limit of the camera field of view. The propagation costcorresponding to the “unknown” area is analyzed in thesummary of the test.

Part c The rover starts this time on asphalt and movestoward tartan, having the same consideration as above.The resulting trajectories can be observed in Figure 8(b).The rover tries to avoid asphalt and diligently reaches the“unknown” area. Similarly to part b, the E* navigationcosts are shown in Figure 7(b). The darker area, close tothe starting point, corresponds to the asphalt.

5.1.2 SummaryThe following tendencies can be observed:

• The rover remains on the same terrain when it is thebest known so far.

• The rover looks for a better terrain when on a “bad”terrain.

In fact the results of this experiment are qualitative. Inparts b and c, the tartan represents an “unknown” areabut its RTI metric could be better than the best known sofar, or on the contrary, worse than the worst. Therefore, itis difficult to determine what the “right” trajectory is andany quantitative comparison with the default trajectory ismeaningless.

5.2 Test: RTILE Complete ApproachThe rover is asked to reach a series of waypoints

which, in conjunction with the environment depicted inFigure 5, corresponds to an interesting trajectory with re-spect to RTILE. For the purpose of explaining the testprogress, the notion of section of trajectory, is introducedhere. Thus section i refers to the trajectory part betweenwaypoints i − 1 and i.

Section 1 proposes the rover with a 13 m trajectory ontartan, followed by two meters on grass. The next sectioncorresponds to a 15 m trajectory on grass and sections 3and 4 present the rover with trajectories on asphalt, which

are 15 m and 12.5 m long respectively. The different ori-entation of those four sections provides the trajectory witha square-like shape.

Section 5 and 6 are both 12.5 m long trajectories onasphalt, but they are positioned just beside other terraintypes. The idea is to offer the rover an alternative, whichcan be exploited by RTILE. Therefore, section 5 and 6have tartan and grass on the left hand-side on a default,straight trajectory linking the waypoints. Also note thatthe rover starts the test without any prior, therefore itmakes only use of the RTI and RTP models learned duringthe test.

5.2.1 ResultThe test was performed twice using the RTILE ap-

proach and twice to have a reference, or default trajectory.The resulting trajectories can be observed in Figure 9 andthe result metrics computed for each section are presentedin Figure 11.

Section 1 to 4 It can be observed that the rover used thefirst four sections to learn a description of the three ter-rains. In fact, the corresponding trajectories are not chal-lenged with respect to the default trajectory, this is partic-ularly true for section 1 and 2, where the RTILE trajectorysuperposes exactly with the default one. The RTILE tra-jectory of sections 3 and 4 show some variations which area conjunction of two elements: The limited field of viewof the camera and the bad RTI metric associated with as-phalt, the worst terrain of the test. Once asphalt has beenlearned (Figure 9 area a), its corresponding RTI metric isworse than the one associated with the “unknown” terrain.The rover is surrounded by asphalt, but the areas out ofthe camera field of view are of class “unknown”. Hencethe rover attempts to drive around the observed asphalt,resulting in the behaviors of those sections.

Section 5 to 6 In sections 5 and 6, the rover altered bothtimes its default trajectory to profit from the better terrainon the left hand-side. According to the knowledge modellearned, the terrains have the following RTI metrics:

MTartanRT I = 0.26, (28)

MGrassRT I = 0.11, (29)

MAsphaltRT I = 0.51. (30)

Although both grass and tartan have lower RTI met-rics than asphalt, the amplitude of the difference is differ-ent. This is observed in the behavior of the rover as wellas the fact that the rover enters tartan (Figure 9 area b)less abruptly than grass (Figure 9 area c).

Finally, the different behavior between both RTILEruns at the beginning of section 5 (Figure 9 area d) can

Page 8: Rights / License: Research Collection In Copyright - …7719/eth-7719-01.pdf · RTILE - Adaptive rover navigation based on online terrain learning A. Krebs*, C. Pradalier**, R. Siegwart***

be explained by the acquisition and processing of cameraimages, occurring at different times.

Despite the variation that can be observed betweenthe RTILE trajectories, the corresponding metrics are verysimilar as the very low variation shows it in Figure 11.

5.2.2 SummaryThe test presented here combines both the learning

and the prediction aspects for three terrains being learneddynamically and it shows nice and expected results. TheRTILE trajectory has a slightly longer length but the vi-brations within the chassis are significantly reduced whenpossible (see Figure 11 sections 5 and 6).

Figure 9. RTILE and default trajectories.

(a) Point of interest e (b) Point of interest f

(c) Point of interest g (d) Point of interest h

Figure 10. E* propagation costs grids.

Figure 10 shows the E* wavefront propagation costsgrid at different locations (depicted as “points of interest”

in Figure 9) which correspond to different terrain repre-sentation as well. Thus the overall figure shows the prop-agation costs are automatically adapted.

• At the beginning, no prior exists and the costs are ini-tialized at the lowest value (Figure 10(a)). The firstterrain learned (i.e. tartan) does not affect this rep-resentation as no point of comparison exists at thistime.

• The second terrain learned (i.e. grass) results in anupdate of the E* propagation costs. The best terrain“observed” (i.e. grass) sees its propagation cost un-changed and remain at the lowest value (lighter areain Figure 10(b)). The rest of the grid, correspondingto the area unobserved, has its propagation costs setaccording to a RTI metric averaging the “best” and“worst” terrain known.

• As the rover reaches the end of section 2, the gridhas been continuously updated during the movementof the rover and a light brown band can be seen onFigure 10(c), showing the area where grass was ob-served.

• Finally, Figure 10(d) shows the propagation costs as-sociated with section 6 at the very end of the test. Thelight brown area corresponds to grass, the blue areacorresponds to asphalt whereas the rest correspondsto areas that were not observed during the test. Notethat those costs were adjusted once more when as-phalt, a new “worst” terrain was learned.

6 Conclusions

The terrains classes are learned according to the roverown representation, based on the features representing thelocal and remote characteristics. In this respect, the fea-tures used in the context of the experiments should beadapted for space application, but the unsupervised ap-proach and cost adaptation methodology proposed is veryvaluable. More concretely, the knowledge acquired dur-ing the mission, and the corresponding terrain descrip-tion models need to be handled consistently if the learningmechanism generates a new class, or if the parameters ofthe existing classes are updated. In order to successfullysolve this issue, a function connecting the terrainability,Λ, and the predicted RTI metric compared to a reference,δMRT I , is introduced:

Λ = CδMRT I

Cαβ , (31)

where Cα and Cβ define the “aggressivenes” of the roverto look for the best terrain. Thus RTILE allows a rover

Page 9: Rights / License: Research Collection In Copyright - …7719/eth-7719-01.pdf · RTILE - Adaptive rover navigation based on online terrain learning A. Krebs*, C. Pradalier**, R. Siegwart***

3210

x

x

x

x

x

x

x

x

x

x

x

x

o

o

o

o o

o

o

o

o

o

o

o

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

4 5 6

MSo

ft [rad

. s-2]

Section [-]

(a) Vibration metric: MS o f t

x

x

x

x

xx

x

x

x

x

x

xo

oo

oo o

o oo

oo

o

0 1 2 3 4 5 612

13

14

15

16

17

18

19

MD

ist [m

]

Section [-]

(b) Distance metric: MDist

Figure 11. RTILE (×) and default (◦) results.

to learn and use the knowledge of an unlimited number ofterrains which is very useful for planetary rovers.

At the moment, Cα and Cβ are established by an initialguess from the user, but it would be extremely interestingto close the loop and allow the rover to learn those pa-rameters as well. This requires to be able to rate globallythe trajectory performed by the rover and in this regard, ametric such as the energy consumption could be used.

7 Acknowledgment

This work was partially funded by ESA. Specialthanks to Cunegonde for her help.

References

[1] A. Angelova, L. Matthies, D. Helmick, andP. Petrona, “Learning and prediction of slip from vi-

sual information,” Journal of Field Robotics, vol. 24,no. 3, pp. 205–231, 2007.

[2] C. Weiss and A. Zell, “Novelty detection and onlinelearning for vibration-based terrain classification,”in 10th International Conference on Intelligent Au-tonomous Systems (IAS), (Baden-Baden, Germany),July 2008.

[3] I. Halatci, C. Brooks, and K. Iagnemma, “A study ofvisual and tactile terrain classification and classifierfusion for planetary rovers,” Robotica, vol. 26, no. 6,pp. 767–779, 2008.

[4] C. Brooks and K. Iagnemma, “Self-supervised ter-rain classification for planetary rovers,” in IEEEAerospace Conference, (Big Sky, Montana, USA),March 2007.

[5] D. Helmick, A. Angelova, and L. Matthies, “Terrainadaptive navigation for planetary rovers,” Journal ofField Robotics, vol. 26, pp. 391–410, August 2009.

[6] A. Krebs, C. Pradlier, and R. Siegwart, “Adaptiverover behavior based on online empirical evaluation:Rover-terrain interaction and near-to-far learning,”Journal of Field Robotics, vol. 27, no. 2, pp. 158–180, 2010.

[7] P. Bessiere, C. Laugier, and R. Siegwart, eds.,Probabilistic Reasoning and Decision Making inSensory-Motor Systems. Springer Tracts in Ad-vanced Robotics, Berlin / Heidelberg: Springer,2008.

[8] D. S. Apostolopoulos, Analytical Configuration ofWheeled Robotic Locomotion. PhD thesis, CarnegieMellon University (CMU), 2001.

[9] D. Ferguson and A. Stentz, “Field d*: Aninterpolation-based path planner and replanner,” in12th International Symposium on Robotics Research(ISRR), (San Fransisco, CA, USA), October 2005.

[10] R. Philippsen, B. Jensen, and R. Siegwart, To-wards Real-Time Sensor-Based Path Planning inHighly Dynamic Environments. Berlin / Heidelberg:Springer Tracts on Advanced Robotics, 2006.

[11] R. Philippsen, “A light formulation of the e* inter-polation path replanner,” tech. rep., Ecole Polytech-nique Federale de Lausanne (EPFL), June 2006.

[12] T. Thueer, Mobility evaluation of wheeled all-terrainrobots: Metrics and application. PhD thesis, Eid-genossische Technische Hochschule Zurich (ETHZ),January 2009.