



Autonomous robot exploration in smart environments exploiting wireless sensors and visual features

Andrea Bardella · Matteo Danieletto · Emanuele Menegatti · Andrea Zanella · Alberto Pretto · Pietro Zanuttigh

Received: date / Accepted: date

Andrea Bardella · Matteo Danieletto · Emanuele Menegatti · Alberto Pretto · Andrea Zanella · Pietro Zanuttigh
University of Padova, Dept. of Information Engineering, via Gradenigo 6/B, 35131 Padova, Italy
Tel.: +39-049-8277770 (Bardella, Danieletto, Zanella), +39-049-8277831 (Menegatti, Pretto), +39-049-8277782 (Zanuttigh)
Fax: +39-049-8277699
E-mail: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

Abstract This paper presents a complete solution for the integration of robots and wireless sensor networks in an ambient intelligence scenario. The basic idea consists in shifting from the paradigm of a very skilled robot interacting with standard objects to that of a simpler robot able to communicate with smart objects, i.e., objects capable of interacting among themselves and with the robots. A smart object is a standard item equipped with a wireless sensor node (or mote) that provides sensing, communication, and computational capabilities. The mote's memory is pre-loaded with object information, such as the name, size and visual descriptors of the object. In this paper, we show how the orthogonal advantages of wireless sensor network technology and of mobile robots can be synergically combined in our approach. We detail the design and the implementation of the interaction of the robot with the smart objects in the environment. Our approach encompasses three main phases: i) discovery, the robot discovers the smart objects in the area by using wireless communication; ii) mapping, the robot, moving in the environment, roughly maps the objects in space using wireless communication; iii) recognition, the robot recognizes and precisely locates the smart object of interest by requesting the object to transmit its visual appearance. The robot then matches this appearance with its visual perception and reaches the object for fine-grain interaction. Experimental validation for each of the three phases in a real environment is presented.

Keywords Wireless Sensor Network · mobile robot · ambient intelligence · object recognition · MoBIF · mapping · localization

The final publication is available at www.springerlink.com DOI: 10.1007/s12243-012-0305-z

1 Introduction

Service robots are looking for their killer applications to leave research laboratories and enter our daily lives, being progressively deployed in houses, on roads and in public spaces. The same trend is experienced by Wireless Sensor Network (WSN) technologies, which are breaking through the academic boundaries to spread over the market. The complementarity of the WSN and robot technologies results in the synthesis of a novel network paradigm, generally called Wireless Sensor and Robot Networks (WSRN), where the two technologies are intimately interconnected in order to enable a large set of advanced and innovative services [1–4]. A similar trend is followed by the Networked Robotics community, which is investigating the idea of exploiting the communication between the robot and sensors distributed in the environment to lower the computational burden and the intelligence requested of the robot to operate effectively in complex environments.



Inspired by the PEIS Ecology developed by Saffiotti et al. [5], where intelligent and complex behaviors are achieved through the cooperation of many simple robots and sensors, we here propose a system that synergetically combines the complementary strengths of the robot and WSN technologies to enable the autonomous exploration of unknown intelligent environments by robots. The key ingredient of our system is the presence in the environment of objects tagged with wireless sensor nodes, or motes for short, which provide communication, processing, and data storage capabilities, thus turning a dummy object into a smart object. The same kind of mote is attached to the robot, in order to enable radio communication with the smart objects. In our solution, the robot does not have any prior knowledge about the shapes of the objects, their positions or their number: all this information is obtained by interacting with the objects. Therefore, unlike most of the solutions proposed in the literature, we can deal with any object, from the simplest to the most complex (like those in Fig. 1), without the need for any a priori knowledge about the objects.

We foresee the application of this system in many different scenarios, including industrial, domestic, and assisted living. As an example, teams of robots may be deployed in automated warehouses to catalog, localize and retrieve objects upon request, without knowing in advance either the type or the location of the objects to be collected. Similarly, robots may be used in libraries to put books back on the shelves after closure. In a home scenario, the same system can be used to tidy up the playroom of kids: at the end of the day, a robot can locate all toys in the room and store them back into the proper containers, provided both toys and containers are smart objects. In the assisted living domain, we may envision a system where an intelligent portable device drives a visually impaired person through a partially unknown environment by vocally describing the smart objects recognized along the path. The audio description of the smart objects may be embedded in the motes' memory by the object's manufacturer and wirelessly transmitted to the portable smart speakerphone upon request. Likewise, a personal robot may display to a motion-impaired user the list of smart objects (remote controls, mobile phones, books, and so on) that it discovers in the environment thanks to wireless and, possibly, multihop communication. Then, the robot may go and grab an object for the user.

The realization of this vision requires the robot to be able to identify, locate and, finally, recognize the different objects in the environment. Whereas radio communication may be successfully exploited to discover which smart objects are located in the environment, the standard localization schemes based on the radio signal strength (RSS) do not provide a precise geographical location of the nodes [6]. This is particularly true in indoor environments, because of the self-interference effects due to multiple reflections of the radio signals. Currently, the most popular RSS-based localization algorithms can locate the motes with respect to the robot with a precision of a meter or so, which is not enough for a reliable localization of the object based only on its ID.

We hence propose to complement the radio-based localization with a visual detection of the object by the robot. To this end, we store locally, in the memory of the smart object, the appearance of the object itself. This information is then transmitted to the robot upon request, so that the robot can visually recognize that object in the image taken by the on-board camera.

Object recognition from visual features is a well-known problem in computer vision, for which many different solutions have been proposed, based both on the analysis of the global appearance of the object and on the extraction of a relevant set of features. Concerning the first class, one of the most popular approaches was proposed by Viola and Jones [7], where an Adaptive Boosting (AdaBoost) learning algorithm is applied to a set of Haar-like features efficiently extracted from images. Lowe [8] proposed an object recognition system based on local image features (SIFT) invariant to image scaling, translation, and rotation. SIFT features have proved to be very robust and effective in many practical object recognition applications. In recent years, bag-of-words (BoW) methods have received great attention for object recognition and categorization tasks (e.g., [9]): these methods aim to represent images with a set of clustered visual descriptors (e.g., SIFT features) taken from a visual vocabulary generated offline. Recently, BoW approaches have been improved by indexing descriptors (visual words) in a vocabulary tree with a hierarchical quantization of the feature descriptors [10] and by using quantization methods based on randomized trees [11]. Despite their efficiency and effectiveness, BoW methods require an accurate off-line learning step. Moreover, all the object signatures (i.e., the collections of visual words that describe the objects) must share the same dictionary: this dependency somewhat limits the portability of these methods. For these reasons, we chose to employ an object recognition system similar to the one proposed in [8], successfully exploited in other recent robotic recognition frameworks (e.g., [12]).

We observe that some authors have proposed the use of RFID technology to ease the recognition of objects by the robot. Although the use of RFID technology may boost the cooperation and interaction between object and robot, its scope is limited by the intrinsic constraints of the RFID technology, such as the rather short operational range (especially for passive RFIDs) and the extremely limited computational capabilities of the RF tags. Moreover, the use of active motes may also provide measurements involving either the object itself, like object temperature, level of filling, inclination, weight, deterioration, or the surrounding environment, such as temperature, pollution, humidity, light, and so on. The smart objects may form a self-established multi-hop wireless network that can be used to enlarge the sensorial and communication range of the robot, augmenting its environmental awareness.


Fig. 1 The robot and some of the smart objects used in the presented experiments. In order to demonstrate the flexibility of the system we chose objects that people usually move around in a room.

The synergetic combination of the two systems is finally realized by means of a suitable suite of communication protocols and algorithms, whose aim is to enable the interaction of the robot with the smart objects located in the area.

In summary, in our approach the robot can create a map with a coarse localization of the smart objects of interest obtained via radio and move toward them. When the robot is in the room where the object of interest is located, it can search for its appearance in the images it has acquired. In this way, the robot can locate the object much more precisely and navigate toward it in order to perform the desired interaction. This process develops along three main phases:

– Object Discovery: the smart objects, which are usually dozing to save energy, are woken, identified and time-synchronized by the robot using the radio interface.

– Object Mapping: the robot starts navigating the environment and uses the information provided by the onboard odometers and the radio signals received from the smart objects to map the objects in space.

– Object Recognition: once the smart objects' mapping is sufficiently accurate, the robot can move toward the estimated location of a target object. When in proximity, the robot asks the smart object to send its visual appearance descriptors, which are continuously matched against those extracted from the stereo camera images. When the matching exceeds a given threshold, the object is recognized in the images taken by the onboard cameras of the robot, which can then exactly localize it in space, approach it and start fine-grain interaction.

In the remainder of this paper we describe in greater detail the algorithms that we used to realize the Object Discovery, Object Mapping and Object Recognition services. We observe that these services could actually be realized by using state-of-the-art solutions taken from the WSN, autonomous robotics or WSRN areas.

However, the approaches proposed in the literature generally fail to catch the potential synergies among the three different phases, in particular between object mapping and recognition, whereas in this manuscript we offer a more complete and organic vision of the system as a whole. Furthermore, we present a large selection of experimental results obtained by using a proof-of-concept testbed that we developed to prove the feasibility of the idea and to identify pitfalls and technical challenges.

Summing up, the strength of the proposed approach with respect to the state of the art consists in the following main points. i) We bring the concept of smart object into the WSRN picture: a common object tagged with a mote that stores object-related information, such as weight, size, appearance, status and so on. This information can be wirelessly transmitted to other nodes, thus enabling the seamless inclusion of new objects into the system, without the need for database updates or similar operations that are instead required in classical object-recognition systems. ii) The use of motes to realize the smart objects, rather than passive RFIDs, makes it possible to establish a multi-hop wireless network that, besides providing the usual WSN services, may relay messages from remote nodes to the robot in case direct communication is not available. This feature improves the flexibility and robustness of the system with respect to passive RFID solutions. We observe that, concerning communication and computational capabilities, WSNs and active RFIDs are rather similar. Nonetheless, the WSN technology natively supports environmental monitoring functionalities that can be integrated into the system in different ways. For instance, the data collected by light sensors can be used to suitably tune the sensitivity of the camera to the environmental brightness or to select the best sets of image features to be transmitted to the robot. iii) Complementing the information obtained by radio-based localization schemes with information extracted from the visual appearance of the smart objects improves the discrimination capability of the robot in the presence of similar objects and the accuracy of the localization of the objects with respect to the robot, and finally enhances its capability of interacting with the objects.

Part of the results presented in this manuscript appeared in the proceedings of some international conferences, namely [13–15]. In this paper, however, we offer a more complete and organic vision of the system as a whole, which was missing in each of the previous publications. Furthermore, we detail the communication protocol between smart objects and robot and propose an innovative multichannel strategy for radio-based localization. Moreover, we assess the improvements with respect to classical single-channel methods by providing extensive experimental results and, as a side result, we compare the localization techniques considered in our previous publications [14,15] with another algorithm, based on the multi-dimensional scaling technique.


For this last technique, we also investigate the accuracy gain obtained by including inter-object measurements in the localization algorithm, which was not considered in our previous publications. Finally, we extend the analysis of the smart object visual recognition by means of MoBIFs, as presented in [13], by introducing a stereoscopic visual recognition setup that makes it possible to estimate the object distances by applying triangulation to the correspondences found in MoBIF descriptor clouds.

The rest of the paper is organized as follows. In Section 2 we describe the object discovery process. In Section 3 we discuss the object mapping problem and present the multichannel ranging technique and the communication protocol that we designed to coordinate the nodes during this phase. Furthermore, we describe the multi-dimensional scaling method for object mapping. In Section 4, we describe the object appearance descriptor method we used to realize the visual object recognition module. In Section 5, we present experiments in which the robot performs all the steps to discover and approach the smart objects. Finally, in Section 6 we draw conclusions and discuss future extensions of the work.

2 Object Discovery

As mentioned above, the number of smart objects and their positions in the environment are initially unknown to the robot. The robot thus needs first to discover and, then, localize the objects in the environment.

The main problem in object discovery is that motes are not always awake and ready to reply to inquiry messages. In fact, the radio transceiver is the most energy-hungry component of a sensor node, so motes generally operate according to a periodic ON-OFF pattern: in ON periods, all the sensing and communication capabilities are active, whereas in OFF periods the radio transceiver and, possibly, other hardware modules are powered off to save energy. Hence, a node is active only for a fraction of time d, typically referred to as the duty cycle. The ON-OFF patterns followed by the different nodes are, in general, asynchronous, since achieving time synchronization in multi-hop WSNs requires rather sophisticated algorithms (see, e.g., [16]). Clearly, the duty cycle d and the ON-OFF cycle period have a direct impact on the node's lifetime, i.e., the time before a sensor node exhausts its battery charge, and on the reactiveness of the node to the occurrence of events or solicitations by other nodes. These two aspects are obviously in contrast, and striking the most convenient balance between them is a subtle design problem that has stimulated different solutions (see, for instance, [17] and references therein). In our scenario, however, the node discovery problem is greatly simplified by the presence of the robot, which can keep its wireless interface always active, thus intercepting any transmission by other nodes in its coverage range.

This feature makes it possible to adopt simple rendezvous strategies that, on the one hand, allow motes to implement ON-OFF patterns with minimal duty cycles, thus saving energy, and, on the other hand, provide quick reaction times, compatible with the length of the cycle time $T_c$.

More specifically, in this work we propose the following Object Discovery strategy, which is graphically exemplified in Fig. 2. The motes attached to the smart objects can operate in two states, namely Active and Quiet. In the Active state, the smart objects keep their transceiver switched on and ready to receive command messages, whereas in the Quiet state each mote follows a periodic ON-OFF pattern. In the ON phase, the mote switches on its radio interface and broadcasts a HELLO message containing basic information about the attached object. Channel access is managed according to the classic CSMA algorithm, so that the message transmission is deferred to a later time if the channel is occupied by other transmissions. The ON phase ends when the channel remains idle for a certain time interval $T_{listen}$. During the ON period the mote processes all the received packets and acts accordingly.

When a robot wishes to discover the smart objects in its proximity, it switches on its radio interface and continuously listens for HELLO messages sent by nearby objects. The robot retrieves and stores the MAC address, the object profile and the other information contained in the received HELLO messages. Furthermore, the robot replies to each message by sending a SYNC message within the time interval $T_{listen}$. The SYNC message contains two main fields, the Temporary Address and the Rendezvous Time. The Temporary Address is a short identifier that the robot assigns to the addressed smart object and that will be used in subsequent communications to refer to that object. The Rendezvous Time, instead, denotes the time interval after which the robot expects the smart object to be in the Active state. The smart object that receives the SYNC message, then, stores its Temporary Address and schedules the transition from Quiet to Active mode after the Rendezvous Time interval. The Rendezvous Time value is updated at each SYNC message in order to synchronize the activation of the smart objects around the same time instant.

The communication protocol is unreliable, since it does not entail any explicit acknowledgment mechanism: if the SYNC message is not correctly decoded by the addressed smart object, the object stays in the Quiet state and keeps performing the ON-OFF cycle. In turn, the information collected by the robot is stored in soft form and is deleted after a certain time period if not refreshed by the reception of new messages from the corresponding smart object. However, the robot can reply to HELLO messages at any time, even when performing tasks other than Object Discovery, in order to identify objects that were initially out of the robot's range.


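For illustration, the mote-side logic of this rendezvous can be condensed in a short sketch. The following Python fragment is a minimal transcription of the Quiet-state behavior described above, not the actual NesC/TinyOS code running on the motes; the radio object, its methods and the timing constants are hypothetical placeholders.

```python
import random
import time

# Illustrative timing constants (not the values used on the real motes).
T_LISTEN = 0.05           # channel-idle interval that closes an ON phase [s]

def quiet_on_phase(radio, profile, state):
    """One Quiet-state ON phase of a mote (hypothetical radio API)."""
    radio.on()
    while radio.channel_busy():                     # CSMA: defer while busy
        time.sleep(random.uniform(0.001, 0.01))     # random backoff
    radio.broadcast({"type": "HELLO", "profile": profile})
    msg = radio.recv(timeout=T_LISTEN)              # the ON phase ends after
    while msg is not None:                          # T_LISTEN of idle channel
        if msg["type"] == "SYNC":
            state["tmp_addr"] = msg["tmp_addr"]     # short address from robot
            # Schedule the Quiet -> Active transition at the rendezvous time.
            state["active_at"] = time.time() + msg["rendezvous_time"]
        msg = radio.recv(timeout=T_LISTEN)
    radio.off()                                     # OFF phase until next cycle
```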

[Fig. 2 diagram: objects A and B alternate ON/OFF phases in the Quiet state; the robot answers each HELLO with a SYNC message carrying the Temporary Address (TmpAddr = 1, 2) and the Rendezvous Time RV = T_RV; at the rendezvous both objects switch to the Active state and the robot sends an RSSI_GET listing TmpAddr 1 and 2 with NEXT RF CH = 2, answered by RSSI_REPORT messages carrying the same channel indication.]

Fig. 2 Example of the Object Discovery procedure, followed by part of the RSSI harvesting procedure.

3 Object mapping

The Object Discovery procedure provides the robot with information concerning the objects in its coverage range. The following step is to map the objects in the area, i.e., to determine their geographical coordinates, in order to enable fine-grain interaction. The problem of node localization in WSNs has long been recognized as an important and challenging issue, and a great deal of research has been carried out in this context. Many solutions assume the presence of a limited number of nodes, called beacons or anchors, that know their own position and are used by other nodes to locate themselves through triangulation techniques. Many of these schemes make use of the Received Signal Strength Indicator (RSSI) to determine a rough estimate of the distance between transmitter and receiver, an operation referred to as ranging. This approach offers the advantage of being readily employable in any radio device, since the RSSI is supported by basically all radio transceivers. Another advantage of RSSI-based ranging is that it does not require the node to be in line of sight with the robot, since the radio signal passes through obstacles, such as people, furniture or even walls. Unfortunately, the range estimate based on RSSI measurements is unreliable and subject to random fluctuations due to a number of environmental factors. Therefore, the accuracy that can be obtained with RSSI-based localization techniques in indoor environments is rather poor, with errors of the order of 1 to 6 meters [6], depending on the number of beacons.

The presence of the robot, however, can drastically enhance the performance of the localization techniques.

For instance, in [18] we showed that the robot, which is fairly well localized by virtue of the on-board odometers and navigation system, can act as a sort of mobile beacon, drastically augmenting the number of reference signals to be used in classical localization algorithms. The presence of robots, moreover, opens the way to much more advanced and sophisticated localization methods. A very flexible and powerful technique is the Simultaneous Localization And Mapping (SLAM) algorithm realized by means of an Extended Kalman Filter (EKF) approach, which merges the information provided by the robot's odometers with the RSSI samples provided by the surrounding objects to simultaneously track the motion of the robot in the environment and refine the mapping of the objects in the area. We experimented with this approach in [14] and observed that the accuracy of the mapping provided by EKF-SLAM is strongly affected by the initial guess of the mote positions, which is required at the beginning of the SLAM procedure to initialize the system state $\Theta$, a vector containing the current estimate of the robot and object locations. Therefore, in [15] we proposed to couple the EKF-SLAM algorithm with a mote position initialization based on particle filters. According to our experimental results, this approach reduces both the mean and the variance of the final location estimate error with respect to the plain EKF approach.

In this work we further advance the investigation of the object mapping problem along three directions: first, we propose a method to increase the accuracy of RSSI-based ranging by exploiting the capability of the sensor nodes to operate on different RF channels;¹ second, we include in our experiments another localization method, namely the Weighted Multi-Dimensional Scaling (MDS), which is computationally lighter than EKF and is based on the same engine used for the object recognition functionalities that will be described in the next section; third, we evaluate the extent to which inter-object RSSI measurements may improve the object mapping. As a side result, in Sec. 5 we provide an experimental performance comparison among the various mapping techniques considered in our work, namely EKF with delayed initialization, Particle Filter only (PF), MDS, and MDS with inter-object measurements.

In the remainder of this section we explain the principle of multichannel ranging and describe the communication protocol we designed to collect RSSI samples over different RF channels. Then, for the reader's convenience, we overview the basics of the MDS approach proposed by Costa et al. in [20], whereas for the details of the other aforementioned localization techniques we refer the reader to our previous publications [14,15].

¹ Some preliminary results obtained with this method were presented in [19].


3.1 Multichannel RSSI-based ranging

The most widely used RSSI-based ranging model is based on the well-known path loss plus shadowing signal propagation model, according to which the received power at distance d from the transmitter can be expressed (in dBm) as

$$P_{rx} = P_{tx} + K - 10\,\eta \log_{10}\!\left(\frac{d}{d_0}\right) + \Psi, \qquad (1)$$

where $P_{tx}$ is the transmitted power in dBm, K is a unitless constant that depends on the environment, $d_0$ is the reference distance for the far-field model to be valid, $\eta$ is the path loss coefficient and $\Psi$ is a random variable that takes fading effects into account [21]. In general, the $\Psi$ term is assumed to be a zero-mean Gaussian random variable, though this model is not always the most appropriate, especially in the presence of line of sight (LOS) between transmitter and receiver [22]. In this case, in fact, the variability of the received power is mainly due to the random phase shifts between the direct path and the strongest reflections, typically off the floor, the ceiling and close-by objects. At the frequency of 2.4 GHz, which is typically used by sensor nodes, moving the transmitter or receiver by a few centimeters can result in a totally different combination of the signal reflections at the receiver, with a significant variation of the received signal power. We observe that the same effect can be obtained by changing the carrier frequency without moving the nodes. As an example, by increasing the carrier frequency of the radio signal from 2.4 GHz to 2.45 GHz, the phase shift between the direct signal and a copy that follows a path 3 meters longer (e.g., a ceiling reflection) will change by approximately $\pi$. This suggests that it is possible to reduce the impact of $\Psi$ in (1) by collecting RSSI samples on different RF channels and, then, using their mean value $\overline{P}_{rx}$ in the ranging equation:

$$\hat{d} = d_0 \, 10^{\frac{P_{tx} + K - \overline{P}_{rx}}{10\,\eta}}. \qquad (2)$$
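For concreteness, the multichannel ranging step amounts to averaging the per-channel RSSI samples and inverting the path-loss model. The following Python sketch illustrates this computation; the values of $P_{tx}$, K, $d_0$ and $\eta$ are illustrative placeholders that would need per-environment calibration.

```python
def range_estimate(rssi_dbm_per_channel, p_tx=0.0, K=-40.0, d0=1.0, eta=3.0):
    """Invert the path-loss model of Eq. (1) using the multichannel mean RSSI.

    Averaging over well-spaced RF channels reduces the impact of the
    fading term Psi before applying the ranging formula of Eq. (2).
    """
    # Mean received power over the RF channels.
    p_rx_mean = sum(rssi_dbm_per_channel) / len(rssi_dbm_per_channel)
    return d0 * 10 ** ((p_tx + K - p_rx_mean) / (10 * eta))

# e.g., samples collected on four maximally spaced IEEE 802.15.4 channels:
d_est = range_estimate([-63.0, -58.5, -61.0, -66.5])  # ~5.5 m with these values
```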

Clearly, to gather RSSI measurements on different channels, the nodes need to coordinate in order to change the carrier frequency in a concerted fashion. To this end, we designed the following protocol, which is initiated by the robot when the objects contacted during the Object Discovery phase enter Active mode at the rendezvous time.

The multichannel RSSI harvesting process occurs in successive rounds. Each round is initiated by the robot, which broadcasts an RSSI_GET message. This message contains the list of smart objects that are required to collect RSSI samples, and the transmission order of the nodes. For compactness, nodes are identified by means of the Temporary Addresses assigned during the Object Discovery phase, rather than by the MAC addresses, which are typically longer. Channel access occurs according to a Time Division Multiple Access (TDMA) scheme: time is partitioned into transmission slots of constant duration (slightly longer than the transmission time of a full data packet), and each node is assigned to a single slot in an exclusive manner.

Each node listed in the RSSI_GET message waits for its assigned slot and then broadcasts an RSSI_REPORT message that contains the vector of RSSI values collected in the previous slots, including the robot's one. Furthermore, the RSSI_GET and RSSI_REPORT messages also carry an indication of the RF channel that will be used in the following round. In this way, nodes that miss the robot's packet, but overhear a report message, can synchronize again in the following round. We observe that the number of RSSI samples reported by the nodes is not homogeneous across the round, since the first nodes that transmit have not yet received messages from the others. To overcome this drawback, the robot may permute the transmission order of the nodes in each subsequent round. Furthermore, each round may be repeated multiple times without changing channel.
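The robot-side bookkeeping for one harvesting round can likewise be sketched in a few lines. The fragment below follows the round structure described above; the message formats and the radio API are illustrative, not the actual TinyOS implementation.

```python
import random

def harvesting_round(radio, tmp_addrs, next_channel, slot_s=0.010):
    """One RSSI_GET/RSSI_REPORT round (robot side).

    tmp_addrs: Temporary Addresses of the nodes taking part in the round.
    Returns {addr: vector of RSSI samples reported by that node}.
    """
    # Permuting the order across rounds evens out the number of samples
    # each node can report (early slots overhear fewer transmissions).
    order = random.sample(tmp_addrs, len(tmp_addrs))
    radio.broadcast({"type": "RSSI_GET", "order": order,
                     "next_rf_ch": next_channel})
    reports = {}
    for slot, addr in enumerate(order):
        # TDMA: node `addr` owns this slot and broadcasts its RSSI_REPORT,
        # which also carries next_rf_ch so late nodes can resynchronize.
        msg = radio.recv(timeout=slot_s)
        if msg is not None and msg.get("type") == "RSSI_REPORT":
            reports[msg["addr"]] = msg["rssi_vector"]
    radio.set_channel(next_channel)     # move to the next RF channel
    return reports
```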

Once again, the communication is unreliable and no ACK mechanism is considered. If a smart object fails to reply to the robot's message for two consecutive rounds, its entry is deleted from the robot's memory and the Temporary Address is released. On the other hand, if a smart object does not receive any message (either from the robot or from other objects) for an interval $T_{timeout}$, it switches back to the Quiet mode.

When the multichannel RSSI harvesting is complete, the robot can move to a new location and repeat the full process.

3.2 MDS

In the following we describe the MDS algorithm for object mapping that we used in our experiments. Let us enumerate from 1 to n the smart objects included in the mapping process. Furthermore, let n+1, n+2, ..., n+k denote the locations where the robot stopped to collect RSSI samples from the surrounding objects. In the following, we refer to these positions as virtual beacon nodes. Let $\theta_i = (x_i, y_i)^T$ be the vector of Cartesian coordinates of node i. Our aim is to determine an estimate of $\theta_i$ for i = 1, 2, ..., n, knowing the exact positions of the virtual beacons and the ranging values given by (2). The MDS approach consists in minimizing the following cost function

$$S(\Theta_k) = \sum_{i=1}^{n} \sum_{j=n+1}^{n+k} 2\, w_{i,j} \left( \hat{d}_{i,j} - d_{i,j}(\Theta_k) \right)^2 \qquad (3)$$

where $\Theta_k = [\theta_1, \ldots, \theta_{n+k}]$ is the state vector, $\hat{d}_{i,j}$ is the estimated distance between smart object i and virtual beacon j, whereas $d_{i,j}(\Theta_k)$ is the distance between the same nodes given the state vector $\Theta_k$. Finally, the scalar $w_{i,j} = e^{-\overline{P}_{rx\,i,j}^{2}/P_{th}^{2}}$ accounts for the accuracy of $\hat{d}_{i,j}$, where $\overline{P}_{rx\,i,j}$ and $P_{th}$ are, respectively, the power received by node i from node j, averaged over the different channels, and the power threshold for ranging. The cost function $S(\Theta_k)$ can also be modified to include the measurements between smart objects in the following way:

$$S(\Theta_k) = \sum_{i=1}^{n} \Bigg( \sum_{\substack{j=1 \\ j \neq i}}^{n} w_{i,j} \left( \hat{d}_{i,j} - d_{i,j}(\Theta_k) \right)^2 + \sum_{j=n+1}^{n+k} 2\, w_{i,j} \left( \hat{d}_{i,j} - d_{i,j}(\Theta_k) \right)^2 \Bigg). \qquad (4)$$

The minimization of $S(\Theta_k)$ cannot be performed in closed form, but the problem can be solved iteratively. Given the state vector at iterative step h, $\Theta_k^{(h)} = \left[\theta_1^{(h)}, \ldots, \theta_{n+k}^{(h)}\right]$, the next state can be computed by applying the following simple updating function (see [20] for the details):

$$\theta_i^{(h+1)} = a_i\, \Theta_k^{(h)}\, b_i^{(h)} \qquad (5)$$

where

$$a_i = \left( \sum_{\substack{j=1 \\ j \neq i}}^{n} w_{i,j} + \sum_{j=n+1}^{n+k} 2\, w_{i,j} \right)^{-1} \qquad (6)$$

and $b_i^{(h)} = \left[b_{i,1}^{(h)}, \ldots, b_{i,n+k}^{(h)}\right]^T$ is a vector whose entries are given by:

$$b_{i,j}^{(h)} = \begin{cases} \alpha\, w_{i,j} \left( 1 - \dfrac{\hat{d}_{i,j}}{d_{i,j}(\Theta_k^{(h)})} \right) & j \neq i \\[2ex] \displaystyle \sum_{\substack{\ell=1 \\ \ell \neq i}}^{n} w_{i,\ell}\, \frac{\hat{d}_{i,\ell}}{d_{i,\ell}(\Theta_k^{(h)})} + \sum_{\ell=n+1}^{n+k} 2\, w_{i,\ell}\, \frac{\hat{d}_{i,\ell}}{d_{i,\ell}(\Theta_k^{(h)})} & j = i \end{cases} \qquad (7)$$

with $\alpha = 1$ if $j \leq n$ and $\alpha = 2$ otherwise. The iterative procedure stops when $S(\Theta_k^{(h-1)}) - S(\Theta_k^{(h)}) < \varepsilon$ for a certain $\varepsilon$. We observe that, although the updating equations are simple to compute, the number of operations grows linearly with the number of virtual beacons, so that the execution of the MDS algorithm progressively slows down as the number of sampling positions increases. The same scalability problem, however, affects the other localization algorithms considered in our previous work. In particular, the complexity of EKF is roughly of order $O(mn^3)$, where m is the number of steps of the robot and n the number of objects in the area, while the complexity of the MDS algorithm is $O(nmL)$, where L is the number of recursions performed by the algorithm to converge to the solution. The value of L grows with the number of sampling positions m, though the dependence of L on m is not available in explicit form. Nonetheless, we experimentally found that MDS is lighter than EKF for reasonable values of n and m.

4 Visual object recognition and interaction

The mapping algorithm described in the previous section is generally capable of localizing the smart objects in the environment with a residual error of the order of one meter. Although this precision may be sufficient to correctly steer the robot towards a target destination, it is not enough to enable physical interaction between the robot and the object. To this end, we need a much more precise localization of the object, which can be obtained by recognizing its appearance in the images provided by the robot's camera. Operatively, when the robot is in the surroundings of the object of interest, it starts sending Descriptor Request messages to that object. The object replies by sending packets containing descriptors of its appearance, which are passed to the object identification module in the robot. Note that each object stores only its own descriptors, so the memory requirements are very limited and the descriptors fit inside the motes' memory. The robot controller then executes the following steps, sketched in code below:

– it gets images from the on-board cameras;
– it extracts the descriptors of these images;
– it compares these descriptors with those transmitted by the object's mote;
– if the object is recognized in the camera images, its position is computed from the descriptors' locations and is passed to the robot navigation module.
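These steps map naturally onto a simple control loop. The following Python sketch is a hypothetical rendition of that loop; the camera, extractor, matcher and navigator objects stand in for the actual robot modules, and the match threshold is an illustrative parameter.

```python
def seek_object(camera, extractor, matcher, navigator, object_descriptors,
                match_threshold=20):
    """Recognition loop: grab frame, extract, match, localize (see the list above)."""
    while True:
        frame = camera.grab()                          # on-board camera image
        descriptors = extractor.extract(frame)         # e.g., MoBIF descriptors
        matches = matcher.match(descriptors, object_descriptors)
        if len(matches) >= match_threshold:            # object recognized
            position = matcher.locate(matches, frame)  # from descriptor locations
            navigator.go_to(position)                  # hand off to navigation
            return position
```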

Clearly, the performance of the object recognition module depends on the method used to represent the object appearance. Ideally, the descriptors should allow the robot to perform a fast and reliable recognition of the object in its visual perspective, irrespective of the object's distance, orientation and lighting conditions. The robot should also be able to recognize the object even if it is partly occluded by other elements of the scene. Many of these features are possessed by the Scale-Invariant Feature Transform (SIFT) descriptors [8]. Furthermore, each SIFT feature descriptor occupies just 128 bytes of memory and, thus, the set of feature descriptors corresponding to an object can be stored in commercial motes. The extraction of SIFTs from the images grabbed by the camera and their comparison with other descriptors taken from a sample image is also fast enough to allow object recognition in about a second.² Moreover, the extracted features are particularly robust to affine transformations and occlusions. Unfortunately, in indoor environments the ambient light is often too dim to grab clear images. In this case, the images taken by the robot while moving will likely be affected by motion blur. To solve this problem, we propose the use of a new feature detector scheme called the Motion Blur Invariant Feature detector (MoBIF), originally developed for humanoid robots [23].

² This value refers to SIFT feature extraction and matching on a 1032×778 image, using the hardware platform described in Subsection 5.1.


Instead of trying to restore the original, unblurred images, the MoBIF approach uses an adapted scale-space representation of the image that tries to overcome the negative effect of motion blur in the invariant feature detection and description process. This approach can also deal with non-uniform, non-linear motion blur. Like SIFT, MoBIF descriptors are particularly robust to affine transformations and occlusions, thus allowing our approach to work correctly even in the case of partially occluded objects. Furthermore, MoBIF descriptors proved to perform similarly to SIFT on standard datasets and outperformed SIFT on images affected by motion blur or taken in dim light.

Unfortunately, MoBIF descriptors (like SIFT) are robust neither to large perspective transformations nor to large rotations of the object about the vertical axis, which occur when the robot observes the objects from very different points of view (as reported in [8], performance decreases when the rotation is larger than 30 degrees). Moreover, SIFT and MoBIF descriptors show good invariance to scale only up to a certain limit (of the order of a couple of meters for the objects considered in this study). We addressed these two problems by taking many pictures of each object from different points of view, separated by 20-30 degrees (thus ensuring that there is always a view corresponding to a rotation for which SIFT descriptors work properly), and at several distances (i.e., 1 m, 3 m, and 5 m). We then extracted the descriptors from each image and added them to a single descriptor cloud in an incremental way: if a descriptor is too similar to another descriptor already present in the cloud, it is rejected. For example, in our experimental setting we took images of each object from 18 different viewing directions and at three different distances. For each image, furthermore, we selected approximately the 10 most informative MoBIFs, so that the total memory footprint of the object description is at most about 70 kB, which fits in the 1 MB flash memory of the Tmote Sky nodes. The set of descriptors resulting from this process completely describes the external surface of the object. Searching for an object in a frame is hence done by looking for correspondences in this set. We remark that merging the information coming from all the pictures and deleting redundant descriptors makes it possible to recognize the object from any point of view, while maintaining a reasonable computational complexity.
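The incremental construction of the descriptor cloud can be sketched as follows. The similarity test and the threshold value are illustrative choices, not the exact ones used in our implementation.

```python
import numpy as np

def build_descriptor_cloud(images, extract_mobif, n_best=10, sim_thresh=0.2):
    """Merge per-view descriptors into one cloud, rejecting near-duplicates.

    images: views of the object (e.g., 18 directions x 3 distances).
    extract_mobif: returns descriptors sorted by informativeness.
    """
    cloud = []
    for img in images:
        for desc in extract_mobif(img)[:n_best]:      # ~10 most informative
            # Reject a descriptor too similar to one already in the cloud.
            if all(np.linalg.norm(desc - c) > sim_thresh for c in cloud):
                cloud.append(desc)
    return np.array(cloud)   # e.g., up to 54 views x 10 x 128 bytes ~= 70 kB
```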

Note that visual recognition is also very useful to distinguish between different objects placed too close to each other to be told apart by the RSSI technique: MoBIF descriptors have a very good matching accuracy and are able to distinguish between objects unless they are very similar. The only critical case is when almost identical objects are placed very close to each other: the object recognition algorithm may then fail, because the visual features of the objects are nearly the same.

In order to improve the precision of the robot localization, we also tested an improved version of the visual recognition system in which a second camera has been added to the proposed setup. The robot is thus provided with two cameras arranged in a binocular stereo setup. This not only improves the recognition performance but also, as is well known from computer vision theory, makes it possible to compute the object distance by triangulation from the positions of its features in the images of the two cameras.

[Fig. 3 block diagram: the images from camera 1 and camera 2 are rectified; MoBIF features are extracted from each rectified view and matched against the object's MoBIF descriptors, yielding the object descriptors in view 1 and view 2; the two sets of matches are combined by a robust estimator and triangulated to produce the feature-robot and object-robot distances.]

Fig. 3 Architecture of the stereoscopic feature extraction system

We exploited the stereoscopic camera setup during autonomous navigation using the approach depicted in Fig. 3. First of all, before starting the autonomous navigation, the two cameras are jointly calibrated, a rectification transform between the two images is computed [24], and the resulting rectification map is stored on the robot. This makes it possible to rectify in real time the images acquired by the cameras during navigation using the pre-computed map, in order to make the descriptor matching easier. Then, as previously introduced, when the robot comes close to an object, the MoBIF descriptors are extracted from the rectified images of both cameras, and the descriptors extracted from each of the two cameras are matched with the object's ones. Note that some of the object descriptors will be present in the images of both cameras, while others could appear in just one of them, because they could be out of the field of view, occluded by other objects, or simply not detected by the feature extraction algorithm. Following this observation, a first advantage of using two cameras is that it is possible to get more feature points and obtain a slight improvement in the object detection performance. However, the main improvement offered by the stereo setup is the additional information on the feature locations. More precisely, as is well known from computer vision, the projection of the same 3D point onto two different views corresponding to a pair of cameras is shifted by an amount inversely proportional to the distance between the object and the cameras. In particular, in the simple configuration of Fig. 4 (referring to rectified images), it can easily be shown that the relation between the object distance Z and the shift of the feature location between the view of the first camera and that of the second camera is given


by [24]:

$$Z = \frac{(C_1 - C_2)\, f}{d_2 - d_1} = \frac{b\, f}{d_2 - d_1} \qquad (8)$$

where $b = (C_1 - C_2)$ is the distance between the two cameras' optical centers $C_1$ and $C_2$ (the stereo system baseline) and f is the cameras' focal length (we used the same focal length for both cameras).
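Eq. (8) translates directly into code. The sketch below computes the per-feature depth for matched descriptor pairs on rectified images; the baseline and focal length values are placeholders, and the sign convention follows Fig. 4.

```python
def feature_depths(matches, baseline_m=0.12, focal_px=700.0):
    """Depth from disparity on rectified images: Z = b*f / (d2 - d1), Eq. (8).

    matches: list of (x_left, x_right) horizontal feature coordinates in
    pixels for corresponding MoBIF descriptors in the two rectified views.
    """
    depths = []
    for x_left, x_right in matches:
        disparity = x_left - x_right          # position shift between the views
        if disparity > 0:                     # skip degenerate matches
            depths.append(baseline_m * focal_px / disparity)
    return depths
```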


Fig. 4 Geometry of a binocular stereo setup

After this computation, for each pair of corresponding descriptors we have a distance measure between the feature point and the robot. The usefulness of this information is twofold. Firstly, all these distance values can be used independently, when the robot is very close to an object, to help the robot in its interaction with the object. Furthermore, when the robot comes into the proximity of one of the objects, it is also possible to estimate the distance between the robot and the object by taking the average of the distances between the various features on that object and the robot. This corresponds to assuming that all the features are located at the object centroid, which introduces an error in the computed distance; however, at least for small objects, this approximation is reasonable. In computing the distance we also used the RANSAC [25] robust estimator, in order to exclude from the computation outliers due to incorrect matches that can arise because of errors in the feature matching. These mismatches could appear, for example, if the object has symmetrical or repeating patterns and a point seen from one of the cameras is matched with a point in the other camera's image corresponding to another instance of the same pattern, or if some features of the object get matched with others not belonging to it. The estimate of the distance between the robot and the centroid of the object obtained from the stereo setup can then be used as initialization information for the WSN localization algorithm of Sec. 3 when the robot starts moving again towards a new object.
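A robust estimate of the object-robot distance can then be obtained along these lines. For illustration, the sketch below approximates the RANSAC step with a simple inlier-consensus loop over the per-feature depths; the tolerance and trial count are arbitrary illustrative values.

```python
import random

def robust_object_distance(depths, inlier_tol=0.15, trials=100):
    """Average per-feature depths while rejecting outlier matches.

    A RANSAC-like consensus: repeatedly pick a candidate depth and keep
    the hypothesis supported by the largest set of inliers.
    Assumes `depths` (from feature_depths above) is non-empty.
    """
    best_inliers = []
    for _ in range(trials):
        candidate = random.choice(depths)
        inliers = [z for z in depths if abs(z - candidate) < inlier_tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Treat the inlier features as lying near the object centroid.
    return sum(best_inliers) / len(best_inliers)
```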

5 Experiments

In order to prove the feasibility of the proposed system and identify drawbacks and problems, we developed a proof-of-concept prototype that we used to run some experiments. Below we briefly describe the platform and present a selection of results.

5.1 The hardware platforms

Smart objects have been realized by gluing Tmote Sky wireless sensor nodes to sample items. The Tmote Sky radio transceiver is the Chipcon CC2420, whose PHY and MAC layers are compliant with the IEEE 802.15.4 standard, operating in the ISM band at 2.4 GHz and providing a bit rate of 250 kbit/s. The module also provides an 8-bit register named Received Signal Strength Indicator (RSSI), whose value is proportional to the power of the received radio signal. The core of the mote is the MSP430, a Texas Instruments low-power microcontroller, which is used to control the communication and sensing peripherals. The microcontroller is provided with 10 kB of RAM and 48 kB of integrated flash memory, used to host the operating system and programs, whereas an additional 1 MB of flash memory is available for data storage. Besides, the board is equipped with integrated light and humidity sensors. The motes have been programmed in NesC, using the TinyOS open-source operating system.

The robot, named Bender, is a custom-built wheeled differential drive platform based on the Pioneer 2 by MobileRobots Inc., depicted in Fig. 1. The robot is equipped with a standard ATX motherboard with a 1.6 GHz Intel Pentium 4, 256 MB of RAM and a 160 GB hard disk, running Linux. The only on-board sensors are a stereoscopic camera and the odometers connected to the two driven wheels. Communication with the laboratory Intranet is provided by a PCMCIA wireless Ethernet card, whereas the connection with the WSN is obtained through a Tmote Sky connected to one of the robot's USB ports.

5.2 Object mapping experiments

In this section we compare the performance of the three localization algorithms introduced in Section 3, namely EKF, PF and MDS, in terms of mean localization error. We also considered MDS with inter-object RSSI measurements, here denoted as MDS Internode. In Fig. 5 and Fig. 6 we report the results obtained in indoor (IN) and outdoor (OUT) scenarios, respectively. Panels (a) show the locations of the nodes in the experiments and the trajectory followed by the mobile robot across the area. Each RSSI harvesting station along the path is marked by a cross. Panels (b) and (c) show the final localization error of each node for the different algorithms, when using single-channel (b) and multichannel (c) ranging, respectively. Furthermore, in Table 1 we collect the mean localization error over all the nodes in the different cases.



First of all, comparing the results achieved with the same setting in the two scenarios, we observe that all the algorithms provide better location estimates outdoors, because of the less severe multipath fading. Second, and more interesting, observing the results reported in panels (b) and (c) of the respective scenarios, we see that in almost all the cases the localization error of all the considered algorithms is reduced when using multichannel rather than single-channel ranging. This experimental evidence confirms the intuition according to which averaging the RSSI samples over multiple channels reduces the uncertainty of the ranging estimate. The counterpart is that the collection of RSSI samples over multiple channels requires a more sophisticated communication algorithm and, in general, may take a longer time. However, we observe that with single-channel ranging it is still necessary to collect multiple RSSI samples for each pair of nodes, in order to average out the fast fading term. Conversely, with multichannel ranging we collect one or a few RSSI samples in each RF channel, but we repeat the operation at successive time instants in different channels, so that the fast fading is still averaged out when taking the mean RSSI value. Therefore, in the end, multichannel RSSI ranging takes approximately the same time as single-channel ranging. In particular, note that the time taken to collect RSSI samples over k different channels can be roughly estimated as $M = k\,(nT + S)$, where n is the number of in-range nodes, T is the slot duration and S is the switching delay that accounts for the time taken by the nodes to switch to the next channel and receive the next RSSI_GET packet from the robot. With the Tmote Sky sensor nodes we used, the slot time turns out to be approximately $T \simeq 10$ ms, while the switching time is $S \simeq 50$ ms. Hence, collecting RSSI samples over k = 4 maximally spaced-apart RF channels from n = 10 nodes takes approximately $M = 4 \times (10 \times 10 + 50) = 600$ ms.

Finally, we note that the MDS Internode algorithm yields, in a few cases, slightly worse localization accuracy than the standard MDS. In the other cases, however, the MDS Internode scheme may provide significant improvements, as, for instance, for node 2 in Fig. 5. The reason is that rough internode ranging estimates may generally impact negatively on the localization accuracy provided by the MDS algorithm when the first guess of the node positions is good. However, in case the nodes are severely misplaced at the beginning of the MDS algorithm, the availability of internode ranging information makes it possible to correct these deficiencies. This is the case of node 2 in Fig. 5. In fact, observing the time evolution of the state vector $\Theta_k$ during the execution of the mapping algorithms (not reported here for space constraints), we could see that the initial guess for this node position, obtained by applying the Particle Filter initialization approach, was close to the position of node 1, which is actually symmetric with respect to the robot trajectory. With the path followed by the robot in this experiment, the EKF, PF and MDS algorithms were not able to recover node 2 from that erroneous initialization, so that the final localization error was large. Conversely, using the internode ranging information between nodes 1 and 2, the MDS Internode algorithm was able to correct the initial error and enhance the accuracy of the final position estimate of node 2.


                          EKF      PF       MDS      MDS Internode
Indoor, single-channel    3.16 m   3.44 m   1.92 m   1.95 m
Indoor, multichannel      1.25 m   1.90 m   1.32 m   0.87 m
Outdoor, single-channel   2.37 m   1.82 m   1.88 m   1.87 m
Outdoor, multichannel     1.14 m   0.80 m   0.95 m   0.90 m

Table 1 Mean localization errors for indoor (first and second rows) and outdoor (third and fourth rows) environments, using single-channel and multichannel ranging.

5.3 Visual recognition experiments

When the robot receives the MoBIF-based object descriptors from the mote, it starts looking for the corresponding object in the surrounding environment. In our experiments, we use several types of smart objects, such as those depicted in Fig. 1. In all the experiments we performed, the robot, in addition to correctly localizing itself and building a map of the perceived motes, was able to correctly recognize the smart objects in its visual perspective. Fig. 7 exemplifies the result of MoBIF descriptor matching (red dots) for a smart object, the blue box, in a cluttered environment. We notice that the matching is correct irrespective of the distance and orientation of the target object. In Fig. 8 we show how, by using the MoBIF descriptors, we are able to obtain a correct recognition even in the presence of motion blur.
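For concreteness, the following sketches the matching step on the robot side: the descriptors received from the mote are compared against the descriptors extracted from the current frame with a nearest-neighbor ratio test. The MoBIF extractor itself is not shown, so plain numpy arrays stand in for the two descriptor sets; the function and parameter names are hypothetical.

```python
import numpy as np

def match_descriptors(obj_desc, frame_desc, ratio=0.8):
    """Return (i, j) pairs where object descriptor i matches frame
    descriptor j under a nearest-neighbor ratio test.
    Assumes frame_desc holds at least two descriptors."""
    matches = []
    for i, d in enumerate(obj_desc):
        dists = np.linalg.norm(frame_desc - d, axis=1)
        nearest, second = np.argsort(dists)[:2]
        if dists[nearest] < ratio * dists[second]:  # unambiguous match
            matches.append((i, nearest))
    return matches
```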

As described in Sec. 4, we also introduced a second camera to improve the performance of the visual recognition module. Fig. 9 shows the extracted features: it can be clearly seen that the matching of the extracted MoBIF features is very reliable. Table 2 reports the number of extracted features for the example of Fig. 9a: some features are present in both cameras (shown in green), but some of the object features are present only in the left or in the right camera image (shown in blue and red, respectively). This shows how the stereoscopic setup makes it possible to extract and match more features than each of the two cameras alone, thus allowing a more reliable object matching.
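A possible way to organize the per-camera match sets that Table 2 tabulates is sketched below: object descriptors matched in both images, only in the left image, or only in the right image are kept as separate evidence pools, corresponding to the green, blue and red points of Fig. 9. The helper is hypothetical; the index sets would come from a matcher such as the one sketched above.

```python
def combine_stereo_matches(left_ids, right_ids):
    """left_ids, right_ids: sets of object-descriptor indices matched
    in the left and right image, respectively. Returns the descriptors
    matched in both images and those seen by only one camera."""
    both = left_ids & right_ids            # green points in Fig. 9
    left_only = left_ids - both            # blue points
    right_only = right_ids - both          # red points
    return both, left_only, right_only
```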


Fig. 5 Indoor scenario: (a) experimental setup, showing the robot path, the virtual beacons and the smart objects on the X/Y plane [m]; (b) mean estimation error [m] per node ID using single-channel RSSI ranging; (c) mean estimation error [m] per node ID using multichannel RSSI ranging, for the EKF, PF, MDS and MDS Internode algorithms.

Fig. 6 Outdoor scenario: (a) experimental setup, showing the robot path, the virtual beacons and the smart objects on the X/Y plane [m]; (b) mean estimation error [m] per node ID using single-channel RSSI ranging; (c) mean estimation error [m] per node ID using multichannel RSSI ranging, for the EKF, PF, MDS and MDS Internode algorithms.


Fig. 7 MoBIF descriptor match test in a complex environment and at different distances. The blue box is the object to recognize, and in both images there are only a few incorrect descriptor matches. In the images, the dots have different colors to indicate the cloud distance with which each MoBIF descriptor is associated.

Fig. 8 Example of correct matches in the presence of motion blur: above, a zoom of an image grabbed by the robot without motion blur; below, the image grabbed while moving, thus with motion blur.

Furthermore, Fig. 9c shows an example of occluded object detection. Even if a smaller number of features is available due to the occlusion, the robustness of the MoBIF descriptors to occlusions and the additional information provided by the second camera make it possible to correctly detect and localize even the partly occluded object.

Following the approach introduced in Sec. 4, the features visible in both cameras can be matched together (as shown in green in Fig. 9) and their respective positions can be used to estimate the object position. The stability of the matching of the MoBIF descriptors and the RANSAC robust estimator provide a reliable distance estimate, as can be seen from Table 3. The table shows the estimate of the distance between the object and the robot in the three example cases of Fig. 9 and compares it with a ground truth measure acquired by a Time-Of-Flight sensor. The error in the estimates is just a few centimeters.
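As an illustration of the distance estimation step, the sketch below triangulates the features matched in both rectified images from their disparity and takes a median over the resulting depths. The median stands in here for the outlier rejection role that RANSAC plays in our pipeline, and the baseline and focal length values are placeholders, not our calibration data.

```python
import numpy as np

def stereo_object_distance(x_left, x_right, baseline=0.12, focal=700.0):
    """x_left, x_right: (N,) horizontal pixel coordinates of the N
    object features matched in both rectified images (same image rows).
    baseline in meters and focal length in pixels are placeholders."""
    disparity = x_left - x_right
    valid = disparity > 0                  # drop degenerate matches
    depth = focal * baseline / disparity[valid]
    return np.median(depth)                # robust to residual outliers
```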

Fig. 9 Feature matching with a stereoscopic setup in three cases: a) box experiment, b) ball experiment, c) occluded object (plush) experiment. The green dots represent features present in the object and in both cameras, while blue and red features are present in the object but only in the left or right camera, respectively. Yellow features do not belong to the object.

         Matched descriptors
Camera   Obj. and both im.   Obj. and left im.   Obj. and right im.   Not belonging to the obj.
         (green)             (blue)              (red)                (yellow)
Left     26                  105                 -                    982
Right    26                  -                   136                  1117

Table 2 Number of matched descriptors for the image of Fig. 9a.

Object   Estimated distance   Real distance   Error
Box      133.7 cm             140 cm          6.3 cm
Ball     136.1 cm             140 cm          3.9 cm
Plush    130.3 cm             133 cm          3.3 cm

Table 3 Comparison between the distance estimates and the ground truth data.

6 Conclusions and future work

In this paper we present a system that enables the autonomous exploration of smart environments by a robot. The application stems from the capability of the robot to closely interact with objects that are enabled for wireless communication and capable of simple computational tasks and limited data storage. The proposed approach allows the robot to progressively acquire environmental awareness by interacting with the smart objects located in the space. The feasibility


of this vision has been proved by means of an experimental prototype of the system, in which a robot proved able to discover the objects in radio range by using RF communication, then to roughly map them in the area through an RSSI-based localization algorithm coupled with a proper initialization scheme based on particle filters, and finally to recognize the objects in its visual perspective by matching the information transmitted by the objects with the appearance descriptors obtained from the onboard cameras.

This prototype can be further improved in many different ways. For instance, we plan to integrate into the SLAM algorithm the localization information that may be extracted from the robot's camera images. The next step will be to map not only the smart objects, but also the rest of the environment, so that the smart objects will be located in a 3D visual map of the environment. In this connection, we are also planning to extend the proposed approach in order to obtain a 3D localization of the robots inside a 3D scene representation, allowing a better interaction between the robots and complex environments. The final step will be to integrate this complete system with a Robotics Brain Computer Interface we are developing in collaboration with IRCCS San Camillo of Venice (Italy) and the University of Palermo (Italy). The BCI system will make it possible to select the smart object we want the robot to interact with just by "thinking of it". This will open the possibility of interacting with a domotic house or an ambient intelligence environment also to people with severe disabilities, such as ALS (Amyotrophic Lateral Sclerosis) patients.

References

1. J. Lee, H. Hashimoto, Intelligent space - concept and contents, Advanced Robotics 16 (3) (2002) 265–280.
2. J. Kim, Y. Kim, K. Lee, The third generation of robotics: Ubiquitous robot, in: Proc. of the 2nd Int. Conf. on Autonomous Robots and Agents, Palmerston North, New Zealand, 2004.
3. Network robot forum. URL www.scat.or.jp/nrf/English/
4. F. Dressler, Self-organization in autonomous sensor and actuator networks, in: Proceedings of the 19th IEEE Int. Conf. on Architecture of Computing Systems, 2006.
5. A. Saffiotti, M. Broxvall, M. Gritti, K. LeBlanc, R. Lundh, J. Rashid, B. Seo, Y. Cho, The PEIS-ecology project: Vision and results, in: IROS 2008: Intelligent Robots and Systems, North-Holland Publishing Co., Amsterdam, The Netherlands, 2008, pp. 2329–2335.
6. G. Zanca, F. Zorzi, A. Zanella, M. Zorzi, Experimental comparison of RSSI-based localization algorithms for indoor wireless sensor networks, in: REALWSN '08: Proceedings of the Workshop on Real-World Wireless Sensor Networks, ACM, New York, NY, USA, 2008, pp. 1–5.
7. P. Viola, M. Jones, Robust real-time object detection, in: Second International Workshop on Statistical and Computational Theories of Vision, 2001.
8. D. G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision 60 (2004) 91–110.
9. J. Sivic, A. Zisserman, Video Google: A text retrieval approach to object matching in videos, in: Proceedings of the International Conference on Computer Vision, 2003.
10. D. Nister, H. Stewenius, Scalable recognition with a vocabulary tree, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2006.
11. J. Philbin, O. Chum, M. Isard, J. Sivic, A. Zisserman, Object retrieval with large vocabularies and fast spatial matching, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2007.
12. D. Meger, P. Forssen, K. Lai, S. Helmer, S. McCann, T. Southey, M. Baumann, J. Little, D. Lowe, B. Dow, Curious George: An attentive semantic robot, in: IROS 2007 Workshop: From Sensors to Human Spatial Concepts, 2007.
13. A. Pretto, E. Menegatti, E. Pagello, Reliable features matching for humanoid robots, in: 7th IEEE-RAS International Conference on Humanoid Robots (2007) 532–538.
14. E. Menegatti, A. Zanella, S. Zilli, F. Zorzi, E. Pagello, Range-only SLAM with a mobile robot and a wireless sensor network, in: IEEE International Conference on Robotics and Automation, ICRA '09 (2009) 8–14.
15. E. Menegatti, M. Danieletto, M. Mina, A. Pretto, A. Bardella, A. Zanella, P. Zanuttigh, Discovery, localization and recognition of smart objects by a mobile robot, in: Simulation, Modeling, and Programming for Autonomous Robots, Vol. 6472 of Lecture Notes in Computer Science, Springer Berlin/Heidelberg, 2010, pp. 436–448.
16. L. Schenato, F. Fiorentin, Average TimeSync: A consensus-based protocol for time synchronization in wireless sensor networks, in: Proceedings of the 1st IFAC Workshop on Estimation and Control of Networked Systems (NecSys09), 2009.
17. P. Dutta, D. Culler, Practical asynchronous neighbor discovery and rendezvous for mobile sensing applications, in: SenSys '08: Proceedings of the 6th ACM Conference on Embedded Network Sensor Systems, ACM, New York, NY, USA, 2008, pp. 71–84.
18. A. Zanella, E. Menegatti, L. Lazzaretto, Self localization of wireless sensor nodes by means of autonomous mobile robots, in: Proceedings of the 19th Tyrrhenian International Workshop on Digital Communications, Ischia, Italy, Sept. 9-12, 2007.
19. E. Menegatti, M. Danieletto, M. Mina, A. Pretto, A. Bardella, S. Zanconato, P. Zanuttigh, A. Zanella, Autonomous discovery, localization and recognition of smart objects through WSN and image features, in: IEEE International Workshop Towards SmArt COmmunications and Network technologies applied on Autonomous Systems (SaCoNAS), Miami, USA, 2010.
20. J. A. Costa, N. Patwari, A. O. Hero, III, Distributed weighted-multidimensional scaling for node localization in sensor networks, ACM Trans. Sen. Netw. 2 (2006) 39–64.
21. A. Goldsmith, Wireless Communications, Cambridge University Press, New York, NY, USA, 2005.
22. A. Bardella, N. Bui, A. Zanella, M. Zorzi, An experimental study on IEEE 802.15.4 multichannel transmission to improve RSSI-based service performance, in: Fourth Workshop on Real-World Wireless Sensor Networks (REALWSN 2010), Colombo, Sri Lanka, 2010.
23. A. Pretto, E. Menegatti, M. Bennewitz, W. Burgard, E. Pagello, A visual odometry framework robust to motion blur, in: IEEE International Conference on Robotics and Automation, ICRA '09 (2009) 2250–2257.
24. R. I. Hartley, A. Zisserman, Multiple View Geometry in Computer Vision, 2nd Edition, Cambridge University Press, ISBN: 0521540518, 2004.
25. M. A. Fischler, R. C. Bolles, Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography, Communications of the ACM 24 (6) (1981) 381–395.