
Map Representation for Robots

Chuho Yi 1, Seungdo Jeong 2 and Jungwon Cho 3

Smart Computing Review, vol. 2, no. 1, February 2012
DOI: 10.6029/smartcr.2012.01.002

1 Department of Electronics and Computer Engineering, Hanyang University / Seongdong-gu, Seoul 133-791, South Korea / [email protected]

2 Department of Information and Communication Engineering, Hanyang Cyber University / Seongdong-gu, Seoul 133-791, South Korea / [email protected]

3 Department of Computer Education, Jeju National University / Jeju-si, Jeju-do 690-756, South Korea / [email protected]

* Corresponding Author: Seungdo Jeong

Received November 20, 2011; Revised January 16, 2012; Accepted January 25, 2012; Published February 29, 2012

Abstract: Map-building and localization are the most basic technologies required to create autonomous mobile robots. Unfortunately, they are difficult problems to handle comprehensively. Expensive sensors or a variety of external devices can resolve them, but such solutions remain limited to particular environments and platforms. Therefore, many researchers have proposed a variety of methods over a long period of time, and continue to do so today. In this paper, we first review the state of existing research on the map representations used in map-building and localization. We divide them into four main categories and compare the differences between them. The identified properties of the four categories can serve as good criteria for choosing appropriate sensors or mathematical models when creating map-building and localization applications for robots.

Keywords: Map representation, Robot map, Metric and topological map, Topological and semantic-metric map

Introduction

Many kinds of animals, as well as humans, have no difficulty remembering their environment and navigating from one place to another. Animals and humans do not necessarily use accurate quantitative information to perceive the space at their current location or to travel to another location. Instead, they remember a few landmarks that define the space. Based on a specific structure or distinct objects, they restructure their knowledge according to spatial context and then reuse that knowledge [1]. In contrast, a robot builds a map from vast amounts of numerical data processed from a sensor, and then calculates its own location using a mathematical model. However, this process has many problems, such as the accumulation of errors over time. The problem of performing map-building and localization for robots at the same time, called Simultaneous Localization and Mapping (SLAM), is surveyed by Durrant-Whyte and Bailey [2, 3]. In this paper, we introduce the map representations used in map-building and localization. Map representation is a method used to simplify map data so that it can be compared with observed data. It is an important design criterion that changes according to the mathematical model, the sensor, or the data type.

In this paper, we classify map representations for robots into four categories based on how the representations handle observed data. The first category is metric representation, which includes methods used to build a map with accurate metric data. The second category is metric and topological representation, which uses topological representation to generate reference points as nodes and then builds a map with metric data for the observations. The third category is metric, topological, and semantic representation, which adds semantic information to the previous two types of data in order to keep track of objects such as doors. The fourth category is topological and semantic-metric representation, which refers to techniques that use spatial relationships between objects, or between the robot and objects.

Figure 1 depicts the four types of map representations categorized in this paper. Under each type of map representation, the figure also lists the research institutions that have used the method and the method's mathematical model.

Map Representation

■ Metric Representation

The first category involves using a variety of sensors to build a map and to estimate the location of the robot more accurately. Thrun et al. solve the off-line full SLAM problem with laser sensors using sparse information matrices in which observed landmarks are assumed to be independent [4, 5]. In other words, the so-called FastSLAM decomposes the estimation problem so that large amounts of data can be handled without maintaining a full covariance matrix, thereby reducing computation costs. In this method, all landmarks are low-level features with metric data.

Figure 2 (a) shows the robot equipped with the laser sensor used in FastSLAM. The upper-left image of Figure 2 (b) depicts the target, some outdoor ruins. The remaining scenes in Figure 2 (b) show the results of 3D map-building with points measured by the laser sensor.
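As a concrete illustration of this factorization, below is a minimal sketch of a FastSLAM-style particle: each particle carries one pose hypothesis plus an independent small EKF per landmark, so no joint covariance over all landmarks is ever stored. The motion model, measurement model, noise values, and all names are illustrative assumptions, not the implementation of [4, 5].

```python
# A minimal sketch of the FastSLAM factorization: one pose hypothesis per
# particle, plus an independent 2x2 EKF for every landmark.
import numpy as np

class Particle:
    def __init__(self, pose):
        self.pose = np.asarray(pose, dtype=float)  # (x, y, heading)
        self.landmarks = {}  # id -> (mean 2-vector, 2x2 covariance)
        self.weight = 1.0

def motion_update(p, v, w, dt, noise=0.05):
    """Propagate the particle pose with a noisy unicycle model (assumed)."""
    x, y, th = p.pose
    v_n = v + np.random.randn() * noise
    w_n = w + np.random.randn() * noise
    p.pose = np.array([x + v_n * dt * np.cos(th),
                       y + v_n * dt * np.sin(th),
                       th + w_n * dt])

def measurement_update(p, lm_id, z, R=np.eye(2) * 0.1):
    """Per-landmark EKF update for an (x, y) observation in the world frame."""
    if lm_id not in p.landmarks:
        p.landmarks[lm_id] = (np.asarray(z, float), np.eye(2))  # initialize
        return
    mu, sigma = p.landmarks[lm_id]
    # The observation model is identity here, so H = I and the EKF is linear.
    S = sigma + R                  # innovation covariance
    K = sigma @ np.linalg.inv(S)   # Kalman gain
    innov = np.asarray(z, float) - mu
    p.landmarks[lm_id] = (mu + K @ innov, (np.eye(2) - K) @ sigma)
    # Weight by the measurement likelihood for later resampling.
    p.weight *= np.exp(-0.5 * innov @ np.linalg.inv(S) @ innov) / \
                (2 * np.pi * np.sqrt(np.linalg.det(S)))
```

In a full filter, many such particles would be propagated, weighted, and resampled; the per-landmark independence is what keeps each update cheap.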

Ayache and Lustman present a method for estimating the location of features in three-dimensional space using three cameras, and for adapting that data to the robot [6]. Sorrenti et al. describe a trinocular SLAM (6 DoF) that uses three cameras with edge segments as features [7]. They build sub-maps of local areas and connect them in a graph structure to solve the closed-loop problem. They solve the data-association problem for landmarks using line segments and their geometric structure.


Figure 3 (a) shows the robot used in trinocular SLAM. It carries many types of sensors, because it was built to create an open dataset for robot research; the red circle marks the three cameras used in trinocular SLAM. Figure 3 (b) shows the input images from each camera with the extracted edge lines in blue. Trinocular SLAM builds a map and estimates the location of the robot using the lines extracted from each camera in 3D space.

Recently, researchers have proposed 6 DoF SLAM with a single camera. Davison et al. propose MonoSLAM, which estimates the camera position and the 3D locations of landmarks by tracking features [8]. This type of SLAM uses an Extended Kalman Filter (EKF) and a single hand-held camera [9, 10]. The method provides high accuracy for the 3D camera location and, if feature detection is stable and continuous, also allows intuitive map-building. However, because the map and robot location both depend on visual features that must be updated constantly, MonoSLAM must overcome the difficulties that emerge when the camera temporarily loses focus.
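To make the EKF formulation concrete, here is a minimal skeleton of the kind of filter MonoSLAM builds on: one state vector holds the camera pose together with every landmark, coupled by a single joint covariance matrix. The constant-position motion model, the dimensions, and the noise values are simplifying assumptions for illustration, not the implementation of [9, 10].

```python
# A minimal EKF-SLAM skeleton: joint state over camera pose and landmarks.
import numpy as np

class EkfSlam:
    def __init__(self, cam_dim=6):
        self.x = np.zeros(cam_dim)        # camera pose (position + orientation)
        self.P = np.eye(cam_dim) * 1e-3   # joint covariance, grows with landmarks
        self.cam_dim = cam_dim

    def add_landmark(self, xyz, init_var=1.0):
        """Augment the state and covariance with a new 3D landmark."""
        self.x = np.concatenate([self.x, xyz])
        n = len(self.x)
        P = np.zeros((n, n))
        P[:n - 3, :n - 3] = self.P
        P[n - 3:, n - 3:] = np.eye(3) * init_var
        self.P = P

    def predict(self, Q_cam=1e-4):
        """Constant-position prediction: only camera-pose uncertainty grows."""
        self.P[:self.cam_dim, :self.cam_dim] += np.eye(self.cam_dim) * Q_cam

    def update(self, H, z, z_pred, R):
        """Standard EKF correction given a measurement Jacobian H."""
        S = H @ self.P @ H.T + R
        K = self.P @ H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (z - z_pred)
        self.P = (np.eye(len(self.x)) - K @ H) @ self.P
```

The quadratic growth of this covariance matrix with the number of landmarks is exactly the cost that the FastSLAM factorization sketched earlier avoids.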

Figure 4 (a) shows the experimental scene: a person with a hand-held camera. The features in MonoSLAM are defined as corners, and they must be tracked constantly. Figure 4 (b) depicts boxes drawn around the corner points used as features in the room. Figure 4 (c) shows the observed corner landmarks as red and yellow dots in the map, together with the estimated camera trajectory (yellow line).


■ Metric and Topological Representation

The second category can address accumulated errors because a single metric coordinate system is maintained only within each local map. Ranganathan and Dellaert propose a method that generates local maps and connects them into a comprehensive topological map. This method uses a laser sensor and multiple cameras to solve the critical issues that arise when creating a topological map [11, 12]. One of these issues is detecting nodes and landmarks automatically while the robot builds the topological map. Another is checking for closed loops during online mapping, which is difficult because the robot's position estimate accumulates errors. They determine distinctive spaces (nodes) and landmarks using Bayesian surprise, with a 360-degree laser sensor and eight cameras collecting the data. Next, the method detects closed loops, despite the accumulated errors, by considering all possible cases of node connections. Computation is minimized by applying the Markov Chain Monte Carlo (MCMC) method, which provides a detailed solution to topological map-building problems.
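As a rough illustration of the Bayesian-surprise criterion, the toy sketch below measures surprise as the KL divergence between the belief before and after an observation, and declares a node wherever the surprise spikes. The scalar Gaussian appearance model and the threshold are invented for illustration; the method of [11, 12] operates on much richer laser and camera data.

```python
# A toy sketch of Bayesian surprise for node detection, under the
# assumptions stated above: the robot tracks one scalar appearance
# statistic with a Gaussian belief, and surprise is KL(posterior || prior).
import numpy as np

def kl_gauss(mu0, var0, mu1, var1):
    """KL( N(mu0, var0) || N(mu1, var1) ) for one-dimensional Gaussians."""
    return 0.5 * (np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

def bayes_update(mu, var, z, obs_var):
    """Conjugate Gaussian update of the appearance statistic."""
    k = var / (var + obs_var)
    return mu + k * (z - mu), (1.0 - k) * var

def detect_nodes(measurements, obs_var=0.5, threshold=0.2):
    """Return the indices where Bayesian surprise exceeds the threshold."""
    mu, var = 0.0, 10.0  # broad prior over the appearance statistic
    nodes = []
    for i, z in enumerate(measurements):
        mu_new, var_new = bayes_update(mu, var, z, obs_var)
        if kl_gauss(mu_new, var_new, mu, var) > threshold:
            nodes.append(i)  # the belief shifted sharply: a distinctive place
        mu, var = mu_new, var_new
    return nodes
```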

Figure 5 (a) shows the result of node detection using Bayesian surprise, in which the value is calculated automatically from a laser sensor and a stereo camera. The metric and topological maps are built from the automatically selected nodes. Figure 5 (b) depicts the probabilistic calculation of various hypotheses for the closed-loop connections between nodes.

Blanco et al. propose a hybrid SLAM method that generates a coordinate system in a continuous local state and simultaneously connects nodes in discrete states [13, 14]. In other words, this method extends existing metric-space SLAM by considering metric and topological map-building simultaneously. The approach builds a map in a hybrid discrete-continuous state space, estimating the robot's position, establishing a local coordinate system, and building a map based on it. They use local coordinate systems, and identify each point of a local coordinate system using a laser sensor.
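The following small sketch illustrates the hybrid idea under strong simplifying assumptions: each node anchors a local metric frame, nodes are linked by relative SE(2) transforms, and a global pose is computed only on demand by composing transforms along a topological path. The two-dimensional simplification and all names are assumptions, not the formulation of [13, 14].

```python
# A small sketch of a hybrid metric-topological map: local metric frames
# at nodes, connected by relative SE(2) transforms.
import numpy as np

def compose(a, b):
    """Compose two SE(2) poses given as (x, y, theta)."""
    x, y, th = a
    bx, by, bth = b
    return (x + bx * np.cos(th) - by * np.sin(th),
            y + bx * np.sin(th) + by * np.cos(th),
            th + bth)

class HybridMap:
    def __init__(self):
        self.local_points = {}  # node -> metric points in that node's frame
        self.edges = {}         # (node_a, node_b) -> relative pose a->b

    def add_edge(self, a, b, rel_pose):
        self.edges[(a, b)] = rel_pose

    def pose_along(self, path):
        """Global pose of the last node, composing transforms along a path."""
        pose = (0.0, 0.0, 0.0)  # the first node anchors the global frame
        for a, b in zip(path, path[1:]):
            pose = compose(pose, self.edges[(a, b)])
        return pose

m = HybridMap()
m.add_edge("n0", "n1", (2.0, 0.0, np.pi / 2))
m.add_edge("n1", "n2", (1.0, 0.0, 0.0))
print(m.pose_along(["n0", "n1", "n2"]))  # -> (2.0, 1.0, pi/2)
```

Because errors are confined to each relative transform, no single global coordinate system has to absorb the accumulated drift.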


Figure 6 (a) shows a local map used to determine the position of a node and its local coordinate system from the laser sensor. Figure 6 (b) depicts the resulting hybrid (metric and topological) map, showing the determined nodes (white circles) and the local metric points around them at the same time.

■ Metric, Topological, and Semantic Representation

Regarding the third category, Vasudevan et al. propose a method that builds a metric map with a laser sensor and a stereo camera (SIFT features), and constructs on top of it a hierarchical probabilistic representation containing doors and objects as the topology [15, 16]. They propose a probabilistic object-graph representation that includes the observed objects and doors, and then classifies the place using prior knowledge of objects. This approach generates a metric map with the laser sensor and stereo camera, and then generates a semantic map of objects and doors on the topological representation. This kind of map can be used for room classification based on the objects observed in a specific category.

Figure 7 (a) shows the resulting map, with doors (green stars and red circles) and objects (blue triangles) as numerical statements. Figure 7 (b) depicts the probabilistic object graph as a semantic map based on the independent spaces separated by doors. In this study, a place is classified as an office, kitchen, or laboratory using a semantic map that consists of pre-classified object categories. For example, if there is an oven, a kettle, and dishes in a particular space, then that space is classified as a kitchen.
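A naive-Bayes toy version of this kind of object-based place classification is sketched below. The object likelihood table is invented for illustration and is not the data of [15, 16].

```python
# A toy naive-Bayes room classifier over observed object labels.
import math

# P(object | room), assumed known from prior knowledge of object categories.
LIKELIHOODS = {
    "kitchen":    {"oven": 0.8, "kettle": 0.7, "dishes": 0.9, "monitor": 0.05},
    "office":     {"oven": 0.01, "kettle": 0.2, "dishes": 0.1, "monitor": 0.9},
    "laboratory": {"oven": 0.05, "kettle": 0.3, "dishes": 0.2, "monitor": 0.7},
}

def classify_place(observed_objects, prior=None):
    """Return the room label with the highest posterior log-probability."""
    rooms = list(LIKELIHOODS)
    prior = prior or {r: 1.0 / len(rooms) for r in rooms}
    best, best_lp = None, -math.inf
    for room in rooms:
        lp = math.log(prior[room])
        for obj in observed_objects:
            # Unknown objects get a small floor probability.
            lp += math.log(LIKELIHOODS[room].get(obj, 0.01))
        if lp > best_lp:
            best, best_lp = room, lp
    return best

print(classify_place(["oven", "kettle", "dishes"]))  # -> "kitchen"
```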

Choset et al. propose a method that detects distinctive spaces as nodes using a Generalized Voronoi Graph (GVG) with a laser sensor [17, 18]. This approach generates both a low-level feature-based map and a high-level topological map. It has the advantage of building and managing a topological map, together with the ability to detect a distinctive space as a node mathematically and clearly.

Figure 8 shows the resulting map, with nodes automatically determined by the GVG, and the topological map built on those nodes. Here, a node is the standard for distinguishing a distinctive place.
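As a rough sketch of the GVG idea, the code below scans a grid for points that are nearly equidistant from their closest obstacle points: equidistance from two obstacles puts a point on the GVG, and equidistance from three or more marks a meet point, which can serve as a topological node. The brute-force grid scan, resolution, and tolerance are illustrative assumptions, not the method of [17, 18].

```python
# A rough sketch of GVG-style meet-point (node) detection on a grid.
import numpy as np

def gvg_nodes(obstacles, xs, ys, tol=0.05):
    """Scan a grid and return candidate node positions (meet points)."""
    obstacles = np.asarray(obstacles, dtype=float)  # (N, 2) obstacle points
    nodes = []
    for x in xs:
        for y in ys:
            d = np.hypot(obstacles[:, 0] - x, obstacles[:, 1] - y)
            d_sorted = np.sort(d)
            # On the GVG: the two nearest obstacles are equally distant.
            # At a node: at least three obstacles are equally distant.
            if d_sorted[2] - d_sorted[0] < tol:
                nodes.append((round(x, 2), round(y, 2)))
    return nodes

# Three obstacle points; their circumcenter is the expected meet point.
obs = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)]
grid = np.linspace(-1, 3, 81)
print(gvg_nodes(obs, grid, grid))  # finds the point near (1.0, 0.75)
```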

■ Topological and Semantic-Metric Representation

Kawamura et al. propose an egocentric navigation system in which the robot moves toward a goal based on observed landmarks [19, 20]. This approach models the sensory egosphere (SES) in the form of a dome built from current observations, while the goal is represented as the landmark egosphere (LES). The robot's goal direction is then determined by comparing the measured angles between currently observed landmarks with the memorized angles between landmarks at the goal. This method uses egocentric navigation to determine the robot's direction from its own perspective, but the representation is too simple to find an alternative path if there is an obstruction. In addition, the robot must observe more than two landmarks while moving to the goal position, because the method lacks map-building and localization.
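The toy sketch below illustrates this kind of angle comparison: the angles separating landmark pairs as seen from the robot (the SES side) are compared with the angles memorized at the goal (the LES side), and the robot steps in the direction that most reduces the mismatch. The brute-force direction sampling and all names are assumptions, not the system of [19, 20].

```python
# A toy sketch of SES-vs-LES angle comparison for egocentric navigation.
import itertools
import numpy as np

def pair_angles(viewpoint, landmarks):
    """Angle subtended by each landmark pair, seen from a viewpoint."""
    def bearing(q):
        return np.arctan2(q[1] - viewpoint[1], q[0] - viewpoint[0])
    angles = {}
    for (i, p), (j, q) in itertools.combinations(sorted(landmarks.items()), 2):
        d = bearing(p) - bearing(q)
        angles[(i, j)] = np.abs((d + np.pi) % (2 * np.pi) - np.pi)
    return angles

def mismatch(viewpoint, landmarks, goal_angles):
    """Squared error between current and memorized pairwise angles."""
    cur = pair_angles(viewpoint, landmarks)
    return sum((cur[k] - goal_angles[k]) ** 2 for k in goal_angles)

def step_direction(robot, landmarks, goal_angles, step=0.2, n_dirs=16):
    """Pick the direction whose trial step most reduces the mismatch."""
    headings = np.linspace(-np.pi, np.pi, n_dirs, endpoint=False)
    trials = [robot + step * np.array([np.cos(h), np.sin(h)]) for h in headings]
    errs = [mismatch(t, landmarks, goal_angles) for t in trials]
    return headings[int(np.argmin(errs))]

# Robot at the origin; three landmarks; memorized angles taken at the goal.
lms = {"A": (0.0, 5.0), "B": (5.0, 5.0), "C": (5.0, 0.0)}
goal_angles = pair_angles(np.array([4.0, 4.0]), lms)
print(step_direction(np.array([0.0, 0.0]), lms, goal_angles))
```

Note that several landmarks must be visible at once for the pairwise angles to constrain the direction, mirroring the limitation noted above.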

Figure 9 (a) shows the exterior of the robot used in egocentric navigation, with its multi-camera sensors. In this study, each landmark is an artificial object with simple colors and patterns so that it can be distinguished easily. Figure 9 (b) depicts the resulting simple map generated with these pattern landmarks. The dotted circle indicates a landmark observed from the robot. Each landmark is represented by its relationships to the other landmarks.

Yi et al. propose a semantic representation and a Bayesian model for the spatial relationships among objects. This representation is useful for mobile robot localization and navigation [21, 22]. The robot moves around an area and makes observations as it generates a map, performing localization in local terms; it also moves to other spaces and uses those positions as waypoints in global terms. The spatial relationships used in the proposed semantic representation include the observed objects, the node-to-object (n-o) distance relationship, the n-o bearing relationship, and the object-to-object (o-o) bearing relationship. The n-o distance relationship represents the distance from a node to a particular object. The n-o bearing relationship represents the direction from a node to a particular object. Finally, the o-o bearing relationship denotes the directional relationships among objects. They use visual pattern recognition to recognize objects and to estimate rough metric data [23].
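A minimal data-structure sketch of such a representation is given below: nodes, objects, and the three relations, each stored with a mean and a variance so that a Bayesian model can score new observations against the map. The field names and the Gaussian form of each relation are assumptions for illustration, not the representation of [21, 22].

```python
# A minimal sketch of a topological-semantic-metric map structure.
from dataclasses import dataclass, field

@dataclass
class Relation:
    mean: float  # e.g. distance in meters or bearing in radians
    var: float   # uncertainty of the relation

@dataclass
class SemanticMap:
    nodes: set = field(default_factory=set)
    objects: set = field(default_factory=set)
    no_distance: dict = field(default_factory=dict)  # (node, obj) -> Relation
    no_bearing: dict = field(default_factory=dict)   # (node, obj) -> Relation
    oo_bearing: dict = field(default_factory=dict)   # (obj, obj)  -> Relation

    def add_observation(self, node, obj, dist, bearing, var=0.1):
        """Record a rough metric observation of an object from a node."""
        self.nodes.add(node)
        self.objects.add(obj)
        self.no_distance[(node, obj)] = Relation(dist, var)
        self.no_bearing[(node, obj)] = Relation(bearing, var)

m = SemanticMap()
m.add_observation("n1", "doorplate_317", dist=2.4, bearing=0.35)
m.oo_bearing[("doorplate_317", "extinguisher")] = Relation(1.2, 0.2)
```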


Figure 10 (a) illustrates the indoor space, with objects and nodes marked on a ground-truth map. The green circles indicate the nodes in the semantic map, and the yellow and blue square boxes represent doorplates. Figure 10 (b) provides a graphical representation of this process.

Figure 11 illustrates the topological-semantic-metric map of the corridor, consisting of 15 nodes (yellow rectangles) and 42 objects (green circles). The topological-semantic-metric map is included as an ontological representation of robot knowledge.


Table 1 compares the existing methods with the proposed method, classified by type of map. Most methods use laser sensors, with the exception of MonoSLAM, which uses a single camera. Methods involving local coordinate systems or topological representation need a laser sensor to ensure accurate coordinates. However, Yi's method generates a local coordinate system when an object is observed, and does not require a laser sensor. Most methods involve low-level, metric features, with the exception of Yi's proposed method and the ETH method. The ETH method performs place classification to distinguish room type (e.g., kitchen, office) based on semantics when particular objects are observed successively in one place. In contrast, Yi's method uses action selection for active localization, enabling more accurate estimation of the robot's position based on semantics when objects in the semantic map are observed.


Conclusions

In this paper, we reviewed map representation methods for robots. We classified map representations into four types and introduced the latest state of the art for each type. The first category was metric representation, which included methods used to build a map with accurate metric data. The second category was metric and topological representation, which used topological representation to generate reference points as nodes and then built a map with metric data for the observations. The third category was metric, topological, and semantic representation, which added semantic information to the previous two types of data in order to keep track of objects such as doors. The fourth category was topological and semantic-metric representation, which refers to techniques using spatial relationships between objects, including the robot. The differences among the four types were found in their sensors, coordinate systems, SLAM models, features, and semantics. This evaluation can be useful for those who want to begin their own map-building and localization robot applications.

References

[1] H. Choset, K. Lynch, S. Hutchinson, G. Kantor, W. Burgard, L. Kavraki, S. Thrun, "Principles of Robot Motion: Theory, Algorithms, and Implementations," MIT Press, 2005.
[2] H. Durrant-Whyte, T. Bailey, "Simultaneous localisation and mapping (SLAM): part I the essential algorithms," IEEE Robotics and Automation Magazine, pp. 99-108, 2006.
[3] T. Bailey, H. Durrant-Whyte, "Simultaneous localization and mapping (SLAM): part II," IEEE Robotics and Automation Magazine, pp. 108-117, 2006.
[4] S. Thrun, W. Burgard, D. Fox, "Probabilistic Robotics," MIT Press, Cambridge, MA, 2005.
[5] M. Montemerlo, S. Thrun, D. Koller, B. Wegbreit, "FastSLAM 2.0: an improved particle filtering algorithm for simultaneous localization and mapping that provably converges," in Proc. of the International Joint Conference on Artificial Intelligence, pp. 1151-1156, 2003.
[6] N. Ayache, F. Lustman, "Trinocular stereo vision for robotics," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 1, pp. 79-85, 1991.
[7] D. Sorrenti, M. Matteucci, D. Marzorati, A. Furlan, "Benchmark solution to the stereo or trinocular SLAM - Bicocca 2009-02-25b BP, 2009," http://www.rawseeds.org/rs/solutions/view/45.
[8] J. Shi, C. Tomasi, "Good features to track," in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 593-600, 1994.
[9] A. Davison, I. Reid, N. Molton, O. Stasse, "MonoSLAM: real-time single camera SLAM," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 1-24, June 2007.
[10] J. Civera, A. Davison, J. Montiel, "Inverse depth parametrization for monocular SLAM," IEEE Transactions on Robotics, vol. 24, no. 5, pp. 932-945, 2008.
[11] A. Ranganathan, F. Dellaert, "Online probabilistic topological mapping," International Journal of Robotics Research, vol. 30, no. 6, pp. 755-771, May 2011.
[12] A. Ranganathan, F. Dellaert, "Bayesian surprise and landmark detection," in Proc. of the ICRA 2009, pp. 2017-2023, 2009.
[13] J. Blanco, J. Fernandez-Madrigal, J. Gonzalez, "Toward a unified Bayesian approach to hybrid metric-topological SLAM," IEEE Transactions on Robotics, pp. 259-270, 2008.
[14] J. Blanco, J. Fernandez-Madrigal, J. Gonzalez, "A new approach for large-scale localization and mapping: hybrid metric-topological SLAM," in Proc. of the IEEE International Conference on Robotics and Automation, pp. 2061-2067, 2007.
[15] S. Vasudevan, R. Siegwart, "A Bayesian approach to conceptualization and place classification: incorporating spatial relationships (distances) to infer concepts," in Proc. of the IEEE IROS Workshop From Sensors to Human Spatial Concepts (FS2HSC), 2007.
[16] S. Vasudevan, V. Nguyen, R. Siegwart, "Cognitive maps for mobile robots - an object based approach," in Proc. of the IROS, pp. 7-12, 2006.
[17] H. Choset, K. Nagatani, "Topological simultaneous localization and mapping (SLAM): toward exact localization without explicit localization," IEEE Transactions on Robotics and Automation, vol. 17, no. 2, pp. 125-137, 2001.
[18] B. Lisien, D. Morales, D. Silver, G. Kantor, I. Rekleitis, H. Choset, "The hierarchical atlas," IEEE Transactions on Robotics and Automation, vol. 21, no. 3, pp. 473-481, June 2005.
[19] K. Kawamura, A. Koku, D. Wilkes, R. A. Peters II, A. Sekmen, "Toward egocentric navigation," International Journal of Robotics and Automation, pp. 135-145, 2002.
[20] T. Keskinpala, D. Wilkes, K. Kawamura, A. B. Koku, "Knowledge-sharing techniques for egocentric navigation," in Proc. of the IEEE Conference on Systems, Man and Cybernetics, pp. 2469-2476, 2003.
[21] C. Yi, I. H. Suh, G. H. Lim, B. U. Choi, "Bayesian robot localization using spatial object contexts," in Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3467-3473, 2009.
[22] C. Yi, "Semantic mapping and active localization for service robot: how to follow human navigation paradigm with a low-grade sensor," Ph.D. dissertation, Hanyang University, Korea, 2012.
[23] M. Munich, P. Pirjanian, E. Bernardo, L. Goncalves, N. Karlsson, D. Lowe, "SIFT-ing through features with ViPR," IEEE Robotics and Automation Magazine, pp. 72-77, 2006.

Chuho Yi received a B.S. from the School of Electrical and Computer Engineering at the University of Seoul, Seoul, Korea in 2000, and received his M.S. and Ph.D. from the Department of Electronics and Computer Engineering at Hanyang University, Seoul, Korea, in 2002 and 2012, respectively. From 2003 to 2006, he worked as a researcher at the Production Engineering Research Institute at LG Electronics, Korea. He is an author of over 10 papers in refereed international journals and conference proceedings. His research interests include SLAM, semantic map-building, active localization, navigation, Bayesian models, and robot vision.

Seungdo Jeong received a B.S. in Electronic Engineering from Hanyang University, Seoul, Korea in 1999, and earned his M.S. and Ph.D. in Electrical and Computer Engineering from Hanyang University, Seoul, Korea in 2001 and 2007, respectively. From 2009 to 2011, he served as a full-time lecturer in the Department of Information and Communication Engineering at Hanyang Cyber University, Seoul, Korea. Since 2011, he has served as a Research Professor at the Research Institute of Electrical and Computer Engineering at Hanyang University, Seoul, Korea. He is an author of over 20 papers in refereed international journals and conference proceedings. His research interests include multimedia information retrieval, computer vision, multimedia content processing, augmented reality, and tensor analysis.

Jungwon Cho received the B.S. degree in Information & Telecommunication Engineering from the University of Incheon, Incheon, S. Korea in 1996, and earned the M.S. and Ph.D. degrees in Electronic Communication Engineering from Hanyang University, Seoul, S. Korea in 1998 and 2004, respectively. In 2004, he joined Jeju National University, Jeju, S. Korea, where he is currently a Professor in the Department of Computer Education and Vice-dean of the College of Education. He also visited Purdue University as a Visiting Scholar in 2007-2008. He is an author of over 20 papers in refereed international journals and conference proceedings. His research interests include computer education, information ethics, smart & ubiquitous learning, and multimedia information retrieval. He is a member of the IEEE and the IEICE.

Copyright © 2012 KAIS