A Novel Dual Camera Method for Odometric Exteroceptive Relative Localisation

Munir Zaman and John Illingworth
School of Electronics and Physical Sciences
University of Surrey
Guildford, GU2 7XH
[email protected]

Abstract

In this paper a new type of sensor data input for mobile robot localisation is proposed. Sensor images are taken from two ground-facing cameras mounted on either side of the robot. The affine transformations between image pairs, computed using the phase correlation method, are used to estimate the motion of the robot. Results on a plain coloured carpeted surface show that the proposed method provides a truly odometric type of sensor data input for mobile robot localisation. Being independent of the kinematics, it is resistant to wheel slippage and other catastrophic failures of kinematic model based methods. A method to calibrate the vision system using a 1D object is also presented.

1. Introduction

Wheel odometry is a commonly used sensor input for mobile robot localisation. Wheel odometry provides information on the internal kinematic positions of the robot. These are then transformed to an estimate of the change in the robot pose based upon a kinematic model of the robot. Wheel odometry provides relative localisation, as it detects changes in pose relative to the previous pose. The advantage of wheel odometry is that it is high resolution and simple to use. It can typically detect movements of the order of tenths of millimetres. However, it has some known drawbacks. The localisation errors increase without bound, and non-systematic errors, such as wheel slippage, are not detectable.

Methods have been published to detect non-systematic errors, but these usually involve an additional sensor such as a gyroscope (Borenstein and Feng, 1996), or have complex configurations (Borenstein, 1998). In cases where wheel odometry is unreliable or absent, exteroceptive sensing methods are used. These sense the environment and therefore do not rely on a kinematic model for estimating relative localisation. For example, in agriculture, where wheel slippage is expected, Doppler ground sensing radar is used (Hague et al., 2000).

Optical mice mounted alongside a mobile robot have been used for relative localisation (Lee, 2004), either on their own or in combination with wheel odometry (Lee and Song, 2004) and a range sensor (Baek et al., 2005). The optical mice are forced onto the ground to minimise variations in distance between the ground plane and the CCD sensor. This increases friction and reduces traction of the wheels of the robot, increasing the possibility of wheel slippage.

Other ground-sensing methods include extracting the orientation of ground tiles to correct for orientation errors (Schroeter et al., 2003), but this is limited to surfaces known to contain such features. An alternative to ground-sensing is based on a single upward-pointing camera which extracts and tracks features on the ceiling in a SLAM framework (Jeong and Lee, 2005).

A method called visual odometry has received recent interest (Nister et al., 2004) and (Campbell et al., 2005). Visual odometry aims to estimate the motion of the robot by extracting and tracking ground point features, to compute a number of optical flow vectors, either through structure-from-motion techniques using a single camera, or from a stereo configuration. The image processing is complicated, as weak features need to be filtered out and matching needs to be robust. Visual odometry has been used on the Mars Exploration Rovers (MER) in areas of high wheel slippage. However, it is unreliable on planar surfaces devoid of sufficient visual texture (e.g., the Meridiani terrain). The fallback to using wheel odometry for relative localisation is based on the premise that there is minimal wheel slippage on planar regions (Cheng et al., 2005).

In this paper a new exteroceptive sensor data input for relative localisation, similar in modality and resolution to wheel odometry, is proposed. The method, coined ‘visiodometry’, can be used on planar regions and has the potential to be used on those surfaces which do not contain sufficient visual texture for ‘visual odometry’.

The rest of the paper is structured as follows. The proposed method is described in section 2. This is followed by a calibration method for the vision system and robotic platform in section 3. Experimental results are presented and discussed, with comparisons with other methods, in section 4. Finally, conclusions and areas for future work are mentioned in section 5.

Figure 1: The visiodometry system is based on a Pioneer DXe differential drive robot augmented by two Sony DXC-9100P cameras orientated to face the ground. (Labelled in the figure: the two Sony DXC-9100P cameras, the two drive wheels, a Sony PTZ camera (not used) and an ultrasonic sensor ring (not used).)

2. Shift Vector Based Visiodometry

Implementation. Figure 1 shows a Pioneer DXe differential drive robot manufactured by Activmedia Robotics, augmented by two ground-pointing cameras. The cameras are mounted on either side to maximise separation, with the image plane approximately 0.5 m above the ground. The cameras are Sony DXC-9100P, compact 1/2" 3CCD progressive scan RGB cameras. The CCD array is composed of square pixels 8.3 µm x 8.3 µm in an array of 792(H) x 582(V) pixels. This is transformed to a PAL format of 720(H) x 576(V) by the frame grabber. As the cameras are fitted with high quality lenses and images are cropped to a central 256 x 256 pixel area, lens distortion is ignored. Due to limitations of the on-board hardware, all image grabbing and processing is performed offline.

The Coordinate Reference Frames. Four 2D coordinate reference frames are defined. They are:

1. World (W). The origin of this frame is an arbitrary 2D point fixed in the world on the ground plane.

2. Robot (R). The origin of this frame is the origin of the robot, typically the kinematic origin, corresponding to the centre of rotation during a spot-rotational motion, with the x-axis being the forward direction.

3. Scene (S). The origin of this frame is the image centre of the camera projected onto the ground (i.e., a point in the world frame). There is therefore a scene frame for each camera.

4. Visiodometry (V). The visiodometry frame is an arbitrarily defined frame of the visiodometry system, which has a known transformation from the scene frames of each camera. As there are two scene frames (one for each camera), the visiodometry frame provides a common coordinate frame for the individual scene frames.

The robot frame relative to the world frame represents the robot's pose. As the cameras are rigidly fixed to the robot, the scene frame relative to the robot frame is constant. As all robot motion is planar, on the ground plane, all coordinate systems are 2D. The relationships between the transformation of the robot frame, the visiodometry frame, and a point fixed in the world frame are described below.

Notation. The robot pose, denoted

$$\mathbf{p}(x, y, \theta) \triangleq (x, y, \theta, 1)^T, \qquad (1)$$

defines its position and orientation in the world frame, where (x, y) is the position and θ the orientation using a Cartesian framework. The matrix Tr2w transforms the robot frame to the world frame,

$$\mathbf{T}_{r2w}(x, y, \theta) \triangleq \begin{pmatrix} \cos\theta & -\sin\theta & 0 & x \\ \sin\theta & \cos\theta & 0 & y \\ 0 & 0 & 1 & \theta \\ 0 & 0 & 0 & 1 \end{pmatrix}. \qquad (2)$$

There is an equivalence between p and Tr2w,

$$\mathbf{p} = \mathbf{T}_{r2w}\,\mathbf{p}_0, \qquad (3)$$

where p0 ≜ (0, 0, 0, 1)^T is the origin or null vector. The relationship between a transformation of the robot frame and the change in the pose of the robot in the world is

$$\mathbf{p}' = \mathbf{T}_{r2w}\,\mathbf{M}\,\mathbf{p}_0, \qquad (4)$$

and

$$\mathbf{M}(\Delta x_r, \Delta y_r, \Delta\theta_r) \triangleq \begin{pmatrix} \cos\Delta\theta_r & -\sin\Delta\theta_r & 0 & \Delta x_r \\ \sin\Delta\theta_r & \cos\Delta\theta_r & 0 & \Delta y_r \\ 0 & 0 & 1 & \Delta\theta_r \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

where p′ is the updated pose, and M the matrix representing the transformation of the robot frame by (Δxr, Δyr, Δθr). The relationship between the change in the visiodometry frame and the new robot pose is

$$\mathbf{p}' = \mathbf{T}_{r2w}\,\mathbf{T}_{v2r}\,\mathbf{V}\,\mathbf{T}_{v2r}^{-1}\,\mathbf{p}_0, \qquad (5)$$

where

$$\mathbf{V}(\Delta x_v, \Delta y_v, \Delta\theta_v) \triangleq \begin{pmatrix} \cos\Delta\theta_v & -\sin\Delta\theta_v & 0 & \Delta x_v \\ \sin\Delta\theta_v & \cos\Delta\theta_v & 0 & \Delta y_v \\ 0 & 0 & 1 & \Delta\theta_v \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

$$\mathbf{T}_{v2r}(x_v, y_v, \theta_v) \triangleq \begin{pmatrix} \cos\theta_v & -\sin\theta_v & 0 & x_v \\ \sin\theta_v & \cos\theta_v & 0 & y_v \\ 0 & 0 & 1 & \theta_v \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad (6)$$

where V represents the change in the visiodometry frame (caused by a change in the robot pose) by (Δxv, Δyv, Δθv), and Tv2r is the matrix transforming the visiodometry frame to the robot frame, where (xv, yv) is the position of the visiodometry origin relative to the robot, and θv the orientation of the visiodometry frame relative to the robot frame.
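
To make the bookkeeping of equations (1)-(6) concrete, the following is a minimal sketch of the pose update in Python/NumPy. The function names and the packing of the pose as a tuple are our own choices, not part of the paper.

```python
import numpy as np

def T(x, y, theta):
    """4x4 transform in the form of equations (2) and (6): a planar rigid
    transform whose third coordinate accumulates the orientation angle."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c,   -s,   0.0, x],
                     [s,    c,   0.0, y],
                     [0.0,  0.0, 1.0, theta],
                     [0.0,  0.0, 0.0, 1.0]])

p0 = np.array([0.0, 0.0, 0.0, 1.0])    # the origin (null vector) of equation (3)

def update_pose(pose, calib, delta_v):
    """Equation (5): p' = Tr2w Tv2r V Tv2r^-1 p0.
    pose    -- current robot pose (x, y, theta) in the world frame
    calib   -- (xv, yv, thetav) from kinematic calibration (section 3.2)
    delta_v -- (dxv, dyv, dthetav), the change in the visiodometry frame"""
    Tr2w, Tv2r, V = T(*pose), T(*calib), T(*delta_v)
    p_new = Tr2w @ Tv2r @ V @ np.linalg.inv(Tv2r) @ p0
    return tuple(p_new[:3])             # updated pose (x', y', theta')
```

Note that T(x, y, theta) applied to p0 returns (x, y, θ, 1)^T, which is the equivalence stated in equation (3).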

2.1 The Shift Vectors

The shift vectors correspond to the translational component of the two scene frames (see figure 2). The affine image translations between pairs of images from each camera are computed using the normalised phase correlation method (Kuglin and Hines, 1975), in pixel units. These are then transformed to known units in the scene frame, as described later in section 3. The shift vector data pairs are analogous to wheel odometry data in a differential drive robot. The translation of the scene frames represented by the shift vectors is analogous to the Euclidean displacement of the wheels, and the separation of the two scene frames is analogous to the wheelbase.

As the phase correlation method is limited to translation estimates, any rotational component between frames will be noise. The rotation between frames is small, at most ±1°, and therefore the pixel shift of the image centre, projecting to the scene frame origin, is considered to approximate to the computed translation.
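
For reference, a minimal sketch of the normalised phase correlation step is given below (NumPy; the function name and the peak-wrapping convention are ours). The pixel shift returned here still needs scaling by the calibration factor of section 3 to give a shift vector in scene units.

```python
import numpy as np

def phase_correlation_shift(img_a, img_b, eps=1e-12):
    """Estimate the integer-pixel translation between two image frames
    using normalised phase correlation (Kuglin and Hines, 1975)."""
    Fa = np.fft.fft2(img_a)
    Fb = np.fft.fft2(img_b)
    cross_power = Fa * np.conj(Fb)
    cross_power /= np.abs(cross_power) + eps    # normalise to unit magnitude
    corr = np.fft.ifft2(cross_power).real       # correlation surface
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap the peak location into signed shifts about zero
    if dy > img_a.shape[0] // 2:
        dy -= img_a.shape[0]
    if dx > img_a.shape[1] // 2:
        dx -= img_a.shape[1]
    return float(dx), float(dy)
```

Sub-pixel refinement of the correlation peak (Faroosh et al., 2002) would slot in after the argmax and is one of the refinements discussed in section 4.3.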

2.2 Relative Localisation

The two cameras take a series of pairs of images of the ground, from which pairs of shift vectors, representing the translation of the scene frame, are estimated. Figure 2 is a schematic showing how the shift vectors (v1, v2), estimated from image pairs from each camera, can be used to estimate the change in robot pose. System calibration provides the estimates for the coordinates of the origins O1 and O2 relative to the robot frame (OR), the orientation of the scene frames, and the scale factor transforming the pixel shifts between image pairs to known units of translation.

Two shift vectors, estimated from pairs of images from the two cameras, are the inputs to the method (see section 2.1). These shift vectors provide information on the motion of the robot. The philosophy of the method is to estimate the robot pose from these shift vectors, based on the observation that the rotational component of the affine transformation between image pairs is equal to the rotational component of the change in pose of the robot, denoted as τ. The visiodometry frame is arbitrarily defined as the left scene frame.

Figure 2: The Shift Vectors. The instantaneous motion of the robot can be approximated to be a rotation of angle τ around an unknown point C. The vectors v1 and v2 are the shift vectors estimated from pairs of images. If the coordinates OR, O1 and O2 are known, the rotation and translation of the robot can be computed.

Using the notation defined earlier, the algorithm is described as follows:

1. Initialisation. Let p = (x, y, θ, 1)^T define the pose of the robot in the world frame. Define Tv2r as the transformation matrix from the left scene frame to the robot frame (this is known from system calibration, described later). The visiodometry frame therefore coincides with the left scene frame.

2. Inputs (v1, v2). The inputs to the method are the two shift vectors v1 = (vx1, vy1) and v2 = (vx2, vy2). These approximate to the translation of the scene frames of each camera, with reference to the visiodometry frame (see figure 2).

3. Estimate Rotation (τ). A motion of the robot corresponds to an affine rotation and translation of the scene frames. Although the translational components may be different (i.e., during rotations), any rotational component must be equal. Assuming a rigid transformation, this must also be equal to the rotation of the robot, denoted as τ.

Hence, from figure 2, the rotation of the visiodometry frame around C by τ is equivalent to rotating the visiodometry frame by τ first and then translating this rotated visiodometry frame from OV to O′V. Therefore the transformation of the visiodometry frame corresponds to a rotation of the shift vectors v1 and v2 around OV,

$$\mathbf{O}_1 - \mathbf{T}\mathbf{R}\,\mathbf{O}_1 = \mathbf{R}\,\mathbf{v}_1, \qquad (7)$$
$$\mathbf{O}_2 - \mathbf{T}\mathbf{R}\,\mathbf{O}_2 = \mathbf{R}\,\mathbf{v}_2, \qquad (8)$$

and

$$\mathbf{R} = \begin{pmatrix} \cos\tau & \sin\tau & 0 \\ -\sin\tau & \cos\tau & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \mathbf{T} = \begin{pmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{pmatrix}, \qquad (9)$$

where R and T are the matrices representing the rotation τ and translation (tx, ty) of the visiodometry frame, post-rotation, whose origin is shown as OV. Substituting R and T into equations 7 and 8 and re-arranging, the equations can be expressed in the form of a set of solvable linear equations Ax = b, where

$$\mathbf{A} = \begin{pmatrix} O_{x_1}+v_{x_1} & O_{y_1}+v_{y_1} & 1 & 0 \\ O_{y_1}+v_{y_1} & -(O_{x_1}+v_{x_1}) & 0 & 1 \\ O_{x_2}+v_{x_2} & O_{y_2}+v_{y_2} & 1 & 0 \\ O_{y_2}+v_{y_2} & -(O_{x_2}+v_{x_2}) & 0 & 1 \end{pmatrix}, \quad \mathbf{x} = \begin{pmatrix} \alpha \\ \beta \\ t_x \\ t_y \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} O_{x_1} \\ O_{y_1} \\ O_{x_2} \\ O_{y_2} \end{pmatrix},$$

where α = cos τ and β = sin τ (the signs in the second and fourth rows of A follow from the form of R), O2 = (Ox2, Oy2), and O1 = (Ox1, Oy1) ≜ (0, 0). The angle turned is

$$\tau = \arctan\!\left(\frac{\beta}{\alpha}\right).$$

4. Update Robot Pose (p′). The rotation and translation of the robot is estimated with reference to the visiodometry origin (i.e., the scene origin of the left camera). From equation 5, the update equation is

$$\mathbf{p}' = \mathbf{T}_{r2w}(x, y, \theta)\,\mathbf{T}_{v2r}(x_v, y_v, \theta_v)\,\mathbf{V}(\Delta x_v, \Delta y_v, \Delta\theta_v)\,\mathbf{T}_{v2r}^{-1}(x_v, y_v, \theta_v)\,\mathbf{p}_0,$$

where (Δxv, Δyv, Δθv) ≜ (vx1, vy1, τ). (A numerical sketch of steps 3 and 4 is given after this list.)
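
The following is a sketch of steps 3 and 4 in Python/NumPy, with the signs in A taken as they follow from equations (7)-(9); the helper name is ours, and step 4 reuses the update_pose sketch given after equation (6).

```python
import numpy as np

def estimate_frame_motion(O2, v1, v2):
    """Step 3: solve Ax = b for alpha = cos(tau), beta = sin(tau) and the
    post-rotation translation (tx, ty). O1 is the visiodometry origin (0, 0)."""
    Ox2, Oy2 = O2                       # right scene origin (from calibration)
    vx1, vy1 = v1                       # left-camera shift vector (scene units)
    vx2, vy2 = v2                       # right-camera shift vector (scene units)
    A = np.array([[vx1,          vy1,           1.0, 0.0],
                  [vy1,         -vx1,           0.0, 1.0],
                  [Ox2 + vx2,    Oy2 + vy2,     1.0, 0.0],
                  [Oy2 + vy2,  -(Ox2 + vx2),    0.0, 1.0]])
    b = np.array([0.0, 0.0, Ox2, Oy2])
    alpha, beta, tx, ty = np.linalg.solve(A, b)
    tau = np.arctan2(beta, alpha)       # angle turned by the robot
    return tau, (tx, ty)

# Step 4: update the pose with (dxv, dyv, dthetav) = (vx1, vy1, tau), e.g.
#   tau, _ = estimate_frame_motion(O2, v1, v2)
#   pose = update_pose(pose, calib, (v1[0], v1[1], tau))
```

For a pure translation (v1 = v2) the solve returns τ = 0 and the pose update is a pure translation, which matches the analogy with wheel odometry drawn in section 2.1.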

3. System Calibration

Calibration consists of two distinct parts: (i) camera calibration, and (ii) kinematic calibration. Camera calibration estimates the scale factor to transform pixel shifts between pairs of images into known units with reference to the scene frame. It also estimates the homography between the scene frames, required for estimating rotational motion. Kinematic calibration allows the visiodometry pose estimates to be consistent with the kinematics of the robot (i.e., the robot frame).

3.1 Camera Calibration

The objective of camera calibration is to estimate (i) the homography between the scenes viewed by the two cameras, and (ii) the transformation of image pixel shifts to scaled shift vectors. The key difference between this configuration and other dual or multiple camera configurations is that there is no overlap in the scenes viewed by each camera. It is assumed that the cameras' image planes are parallel to the ground plane (i.e., there is no perspective in the scene), and that all points in the scene are equidistant from the image plane. A 1D directional line, in our case a measuring tape, is the calibration object. Camera calibration methods using 1D objects have been published recently (Zhang, 2002), (Baker and Aloimonos, 2003) and (Cao and Faroosh, 2004); however, these methods are not suitable for configurations where there is no scene overlap, or do not provide estimates of extrinsic camera parameters.

Figure 3: System Calibration. (a) Camera calibration: estimate the scene homography between the cameras and the scale factor to convert pixels to known units in the scene frame. (b) Kinematic calibration: estimate the matrix Tv2r transforming the visiodometry frame to the robot frame.

The method relies on a directional 1D calibration object which cuts across the views of both cameras (see figure 3(a)). Any straight object with known separation of control points could be used; in our case a metal measuring tape laid on the ground is used. Figure 4 shows example images of the tape cutting across the views of both cameras simultaneously, from which the graduations on the tape provide the metric information and the direction. The coordinates of the markings on the tape (u, v) and the observed value on the tape (r) are the inputs to the method. An affine camera model is assumed, where the pixel coordinates of a point in the image project to a scaled point in the scene frame.

Figure 4: Camera Calibration. Example images from the left and right cameras of the robot, used as input images for camera calibration ((a) left camera view, (b) right camera view). The tape measure is the 1D directional line used as the calibration object.

The method first converts the input coordinates (u, v) to an image-centred coordinate frame (x, y). From prior knowledge of the aspect ratio of the pixels in the camera CCD array and the PAL conversion distortion error introduced by the image capture card, the distortion can be compensated for. The converted input positions are then fitted to a straight line whose equation is of the form y = a + bx. This line represents the straight edge of the tape. The line is then rotated around the image centre by ψ, so that it lies parallel to, and in the same direction as, the x-axis. The equation of this line is y = a cos ψ. The scale factor α, mapping the image coordinates to the scene coordinates, is estimated from the observations r of the tape. The origin of the scene frame is assumed to be the image centre projected onto the ground plane.

Using the estimate of the scale factor and the equation of the rotated line in scene coordinates, the scene coordinates of a fixed point on the tape after rotation, (X0′, Y′), can be derived. This could be any fixed point on the tape (e.g., where the tape reads zero).

From the values of ψ, X0′ and Y′ estimated from each camera, the scene homography between the two camera views can be derived by taking the difference between their respective parameters,

$$\phi = \Delta\psi, \qquad T_x = \Delta X_0, \qquad T_y = \Delta Y,$$

where the homography parameters are a rotation φ and a translation (Tx, Ty), which transform a scene coordinate in one camera's scene frame to the scene frame of the other camera.
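
As an illustration of this procedure, the sketch below fits the tape line, de-rotates it, and estimates the scale factor and the homography parameters. It is only our reading of the description: the pixel aspect-ratio and PAL distortion corrections are omitted, and the function names (calibrate_camera_1d, scene_homography) are not from the paper.

```python
import numpy as np

def calibrate_camera_1d(u, v, r, image_size=(720, 576)):
    """One camera's calibration from tape graduations.
    u, v : pixel coordinates of the graduations; r : readings on the tape.
    Returns (psi, alpha, X0, Y): line orientation, scale factor (scene units
    per pixel), and the scene coordinates of the tape's zero point after
    the line has been rotated parallel to the x-axis."""
    # Image-centred coordinates (aspect-ratio / PAL corrections omitted here)
    x = np.asarray(u, float) - image_size[0] / 2.0
    y = np.asarray(v, float) - image_size[1] / 2.0
    # Fit the tape edge as y = a + b*x; its orientation is psi
    b, a = np.polyfit(x, y, 1)
    psi = np.arctan(b)
    # Rotate the points by -psi so the line lies parallel to the x-axis
    xr = np.cos(psi) * x + np.sin(psi) * y
    # Scale factor: least-squares fit of tape readings against pixel abscissa
    alpha, r0 = np.polyfit(xr, np.asarray(r, float), 1)   # r ~ alpha*xr + r0
    x_zero = -r0 / alpha              # pixel abscissa where the tape reads zero
    X0 = alpha * x_zero               # its scene x-coordinate
    Y = alpha * a * np.cos(psi)       # scene y of the rotated line (y = a*cos(psi))
    return psi, alpha, X0, Y

def scene_homography(left, right):
    """Homography parameters between the two scene frames:
    phi = delta(psi), (Tx, Ty) = (delta(X0), delta(Y))."""
    psi_l, _, X0_l, Y_l = left
    psi_r, _, X0_r, Y_r = right
    return psi_l - psi_r, X0_l - X0_r, Y_l - Y_r
```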

3.2 Kinematic Calibration

The objective of kinematic calibration is to align the robot frame, corresponding to the kinematic origin of the robot, with the visiodometry frame; this corresponds to estimating the transformation matrix Tv2r(xv, yv, θv) (see equation 6). Kinematic calibration has no bearing on the accuracy of the estimates from visiodometry, but allows the pose estimates from visiodometry to be compared directly with the pose estimates from wheel odometry. An approximating two-part qualitative method, implemented in our case, is described here.

Figure 5: Estimating the Ground Truth. The inputs to the method are the lengths l1 . . . l6, which define a quadrilateral with diagonals. Constraints are introduced to ensure the lengths form a quadrilateral. The coordinates of the marker positions are then computed, from which the relative end pose is estimated.

The first part aims to estimate the variable θv, corresponding to the orientation between the visiodometry and the robot frame. Recall that, from camera calibration, the scene axes are aligned with each other. However, they are not necessarily aligned with the robot frame axes, and therefore require rotating by θv. This value is estimated by plotting the visiodometry positions during a known translational motion of the robot, and adjusting the value of θv until the position plots lie along the direction of the robot heading (i.e., along the x-axis). During a spot rotation (i.e., a pure rotational motion around the kinematic origin), there is no change in the position of the robot origin; however, as the visiodometry origin is not coincident with the kinematic origin, a spot rotation results in a translation of the visiodometry origin. Initial values of xv and yv, estimated from rough measurements of the positions of the cameras relative to the midpoint of the wheelbase, are therefore adjusted to minimise the translational motion estimates during known spot rotations.
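
As a simple illustration of the first part, θv can be chosen so that the net visiodometry displacement recorded during a known straight-ahead run lines up with the robot x-axis. The closed form below is our own shortcut for the manual adjustment described above, not the paper's procedure.

```python
import numpy as np

def estimate_theta_v(increments):
    """increments: per-frame visiodometry translations (dx, dy) logged during
    a known pure forward translation of the robot. Returns the angle theta_v
    that rotates their resultant onto the robot x-axis."""
    dx, dy = np.sum(np.asarray(increments, float), axis=0)
    return -np.arctan2(dy, dx)
```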

4. Experimental Results and Discussion

In this section the experiments are presented, together with a discussion of the method in comparison with other exteroceptive relative localisation methods.

Figure 6: Sample Ground Images. Each image is of 256x256 pixels and represents an area of ≈4 cm² of a plain-coloured, low-pile, carpeted surface. For each run, the image taken when the robot is still can be compared with the slightly blurred images taken during robot motion (at times tk and tk+1).

4.1 The Ground Truth.

Although the continuous ground truth position of the robot during its run is not known, the position plots estimated from wheel odometry provide a qualitative comparison sufficient to evaluate the concept of visiodometry. The end ground truth pose, however, is estimated from marker positions placed at either side of the wheel at the start and end pose (i.e., when the robot is still). Figure 5 is a schematic showing the six measurement inputs l1 . . . l6: two diagonals and four edges. The start pose marker positions are (SL, SR) and the corresponding end pose marker positions are (EL, ER). The coordinates of the marker positions are computed and then adjusted using least squares methods, to ensure that the lengths form a quadrilateral. The change in pose can then be trivially computed (assuming that the robot origin lies midway between the markers). The accuracy of the ground truth end pose was estimated to be of the order of ±0.25° in orientation and ±1 mm in position.
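
A sketch of this geometry is given below: the start markers define a local frame, the end markers are trilaterated from the measured distances, and the relative end pose follows from the marker midpoints and baselines. The mapping of the six lengths l1 . . . l6 onto named distances, and the least-squares consistency adjustment, are not reproduced here, and the function names are ours.

```python
import numpy as np

def relative_end_pose(d_sl_sr, d_sl_el, d_sr_el, d_sl_er, d_sr_er):
    """Relative end pose from marker distances (edges and diagonals of the
    quadrilateral SL-SR-ER-EL of figure 5). SL is placed at the origin and SR
    on the x-axis; the robot origin is assumed midway between its markers."""
    SL = np.array([0.0, 0.0])
    SR = np.array([d_sl_sr, 0.0])

    def trilaterate(d_from_sl, d_from_sr):
        # Intersect circles centred on SL and SR; the y >= 0 branch is taken
        # (the sign must be chosen to match the physical layout).
        x = (d_from_sl**2 - d_from_sr**2 + d_sl_sr**2) / (2.0 * d_sl_sr)
        y = np.sqrt(max(d_from_sl**2 - x**2, 0.0))
        return np.array([x, y])

    EL = trilaterate(d_sl_el, d_sr_el)
    ER = trilaterate(d_sl_er, d_sr_er)

    # Change in position: displacement of the marker midpoint
    dpos = (EL + ER) / 2.0 - (SL + SR) / 2.0
    # Change in orientation: rotation of the marker baseline
    e, s = ER - EL, SR - SL
    dtheta = np.arctan2(e[1], e[0]) - np.arctan2(s[1], s[0])
    return dpos, dtheta
```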

4.2 Experiments

Two runs were conducted. Each run consisted of four 1 metre legs, with a 90° rotation towards the end of each leg. Each run therefore covers a cumulative distance of approximately 4 m and a rotation of 360°, describing an approximately square path. Run 1 was in a counter-clockwise direction, consisting of 1745 frames per camera and lasting 70 seconds. Run 2 was a slower run in a clockwise direction, consisting of 2945 frames per camera and lasting 118 seconds. The total angle turned, from ground truth estimates, was 351.3° and 358.0° for run 1 and run 2 respectively. Runs 1 and 2 were taken at different times with different camera configurations (e.g., different camera separation).

Figure 7: Ground Plots. Robot position estimates from the proposed method are compared with position estimates from wheel odometry and the end ground truth position ((a) Run 1, (b) Run 2; axes: x-coordinate (m), y-coordinate (m)).

Examples of the ground images used for run 1 and run 2 are shown in figure 6. A contiguous pair of images taken 40 ms apart during rotational motion of the robot shows the small amount of blurring. An image pixel corresponds to ≈1/10 mm in the scene frame. This defines the resolution of the method, which is comparable to wheel odometry.
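
As a rough check of that figure, taking the ≈4 cm² image footprint quoted in figure 6 at face value:

$$\sqrt{4\ \mathrm{cm}^2} = 2\ \mathrm{cm} = 20\ \mathrm{mm}, \qquad \frac{20\ \mathrm{mm}}{256\ \mathrm{pixels}} \approx 0.08\ \mathrm{mm/pixel} \approx \tfrac{1}{10}\ \mathrm{mm\ per\ pixel}.$$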

Ground Plots. Ground plots comparing visiodometry with wheel odometry are shown in figure 7. The visiodometry plots show a degree of congruence with the position plots from wheel odometry. It is considered that, despite the controlled conditions, the wheel odometry plot is not the ground truth path, and therefore it is possible that the visiodometry plot is closer to the true path of the robot. Visiodometry pose estimates during known translational motions show high congruence with wheel odometry estimates, and are quite straight. However, pose estimates during rotational motions differ slightly from wheel odometry estimates.

Orientation Error. The ground truth end pose provides the true cumulative orientation of the method. The orientation accuracy is calculated as follows:

$$\text{percentage heading error} = \frac{\varepsilon_\theta}{\Sigma\tau} \times 100\,\%,$$

where εθ is the error in heading. The total angle turned, Στ, is computed using the ground truth estimates of the difference in orientation between the start and end pose, knowing that the robot had rotated approximately 360° in total. The percentage orientation error was 4.1% and 2.1% for run 1 and run 2 respectively. This is slightly less accurate than the estimates from wheel odometry (2.1% and 0.5%).
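
In absolute terms, assuming the percentages are taken relative to the ground-truth totals quoted above, these correspond to cumulative heading errors of roughly

$$\varepsilon_\theta \approx 0.041 \times 351.3^\circ \approx 14^\circ \ \ (\text{run 1}), \qquad \varepsilon_\theta \approx 0.021 \times 358.0^\circ \approx 7.5^\circ \ \ (\text{run 2}).$$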

4.3 Discussion

The difference between the plots and the orientation error between the runs can be explained by the increased blurring in the images of run 1, where the robot was translating and rotating at approximately twice the rate of run 2.

Error Analysis. Although the limited results show that visiodometry is less accurate than wheel odometry under non-slip conditions, the sources of error are known and there is potential for them to be minimised. Visiodometry relies on accurately extracting the pixel translations between image frames. The implementation estimated this to pixel-level accuracy; however, there are methods to estimate it to sub-pixel accuracy (Faroosh et al., 2002).

During rotational motion, there is a rotational component between image frames. However, the pixel translation estimate computed using the phase correlation method is an unknown averaging function of the translation of all points in the image. This may not correspond to the translation of the image centre, which may affect the accuracy of any change in pose estimates.

Another source of error is due to the PAL conversion distortion, which performs a shear along the image axis to compress a 792(H) x 582(V) CCD array image to a PAL 720(H) x 576(V) pixel image. If there is a rotational component between the frames, then not only is there an error due to the rotation itself, but in addition there is an error due to the shear being applied to the axis local to the image, which will not be parallel to that of the other image (due to the rotation). This can be addressed by correcting for image distortion prior to applying the phase correlation method, but this has the potential to introduce interpolation errors which could adversely affect the translation estimates from the phase correlation method.

In addition, there may be errors due to assumptions made during calibration, for example the assumption that the image centre corresponds to the optical centre.

Comparison with ‘visual odometry’. Visual odometry relies on computing numerous optical flow vectors estimated from extracting, tracking and filtering point ground features (with the associated data storage and processing requirements). The proposed method has been demonstrated to work on planar ground, on a fairly uniformly coloured carpeted surface. It is considered that visual odometry would fail on such surfaces, which are devoid of the visual texture it requires. Given the resolution of visiodometry as implemented here, it is likely that the planar surfaces on Mars contain sufficient visual texture for visiodometry to work.

The key distinction between visual odometry and visiodometry is that visual odometry is based on feature matching correspondences, whilst visiodometry is correspondenceless. This makes visiodometry more robust and simpler to implement.

Comparison with wheel odometry. Reliable relative localisation using wheel odometry data relies on an accurate kinematic model and the absence of non-systematic errors such as wheel slip. The wheel odometry plots used to provide a qualitative comparison with visiodometry were from runs conducted under ideal conditions of minimal or no wheel slip.

In comparison, visiodometry is exteroceptive, and estimates motion directly by sensing the environment. It therefore has the potential to provide reliable odometric type data in environments where there is wheel slippage, or where there is a catastrophic failure of the kinematic model. It can also be used to provide odometric type data where the kinematics are unknown, or where the robot does not have wheel odometry.

Comparison with Optical Mice. The method using optical mice has the severe drawback of requiring the optical mice to be forced down onto the ground, causing a significant increase in friction (Lee, 2004). This limits the method to ground surfaces which have low relief and low abrasion and are not friable. The proposed method does not require any physical contact with the ground, and is robust to surface conditions. The authors claim that optical mice are as accurate as wheel odometry. Although the proposed method may not be as accurate, it is non-contact, and this is a significant advantage.

5. Conclusions

In this paper a novel method for mobile robot localisation has been presented. The experimental results demonstrate the feasibility of the proposed method for relative localisation in planar environments where other existing methods, including wheel odometry, visual odometry and optical mice, are not suitable. The accuracy and resolution of visiodometry estimates during translational motion are similar to wheel odometry. Although the orientation error is greater than that of wheel odometry, the method is resistant to wheel slip and there are opportunities to refine it to improve accuracy.

A method to calibrate the visiodometry system using a 1D directional line was presented. The results validate the calibration method.

Future Work. The intention of the implementation presented in this paper was to demonstrate the feasibility of the concept. A number of approximating assumptions were made, and refinements which have the potential to improve accuracy were not implemented. Future work is to analyse the possible sources of error, especially those that affect orientation accuracy, and to evaluate the accuracy and robustness of the method on different types of surfaces.

References

Baek, S., Park, H., and Lee, S. (2005). Mobile robot localization based on consecutive range sensor scanning and optical flow measurements. In IEEE Conference on Advanced Robotics, pages 17–22.

Baker, P. and Aloimonos, Y. (2003). Calibration of a multi-camera network. In Conference on Computer Vision and Pattern Recognition Workshop, volume 7, page 72.

Borenstein, J. (1998). Experimental results from internal odometry error correction with the OmniMate mobile robot. IEEE Transactions on Robotics and Automation, volume 14, pages 963–969.

Borenstein, J. and Feng, L. (1996). Gyrodometry: A new method for combining data from gyros and odometry in mobile robots. In IEEE International Conference on Robotics and Automation, pages 423–428.

Campbell, J., Sukthankar, R., Nourbaksh, I., and Pahwa, A. (2005). A robust visual odometry and precipice detection system using consumer-grade monocular vision. In IEEE International Conference on Robotics and Automation, pages 3421–3427.

Cao, X. and Faroosh, H. (2004). Camera calibration without metric information using 1D objects. In International Conference on Image Processing, volume 2, pages 1349–1352.

Cheng, Y., Maimone, M., and Matthies, L. (2005). Visual odometry on the Mars Exploration Rovers. In IEEE International Conference on Systems, Man and Cybernetics, pages 903–910.

Faroosh, H., Zerubia, J., and Berthod, M. (2002). Extension of phase correlation to sub-pixel registration. IEEE Transactions on Image Processing, volume 11, pages 188–200.

Hague, T., Marchant, J. A., and Tillett, N. D. (2000). Ground based sensing systems for autonomous agricultural vehicles. Computers and Electronics in Agriculture, volume 25(1-2), pages 11–28.

Jeong, W. Y. and Lee, K. M. (2005). CV-SLAM: a new ceiling vision-based SLAM technique. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2005), pages 3195–3200.

Kuglin, C. D. and Hines, D. C. (1975). The phase correlation image alignment method. In Proceedings of the IEEE Conference on Cybernetics and Society, pages 163–165.

Lee, S. (2004). Mobile robot localization using optical mice. In IEEE Conference on Robotics, Automation and Mechatronics, pages 1192–1197.

Lee, S. and Song, J.-B. (2004). Robust mobile robot localization using optical flow sensors and encoders. In IEEE Conference on Robotics and Automation, pages 1039–1044.

Nister, D., Naroditsky, O., and Bergen, J. (2004). Visual odometry. In Computer Vision and Pattern Recognition (CVPR), volume 1, pages 652–659.

Schroeter, C., Boehme, H. J., and Gross, H. (2003). Extraction of orientation from floor structure for odometry correction in mobile robotics. In German Association for Pattern Recognition (DAGM 2003), pages 410–417.

Zhang, Z. (2002). Camera calibration with one-dimensional objects. In Proceedings of the European Conference on Computer Vision, pages 161–174.