Off-the-Shelf Vision-Based Mobile Robot Sensing
Zhichao Chen
Advisor: Dr. Stan Birchfield
Clemson University
Vision in Robotics
• A robot has to perceive its surroundings in order to interact with them.
• Vision is promising for several reasons:
  – Non-contact (passive) measurement
  – Low cost
  – Low power
  – Rich capturing ability
Project Objectives
• Path following: Traverse a desired trajectory in both indoor and outdoor environments.
  1. "Qualitative vision-based mobile robot navigation", Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2006.
  2. "Qualitative vision-based path following", IEEE Transactions on Robotics, 25(3):749-754, June 2009.
• Person following: Follow a person in a cluttered indoor environment.
  "Person following with a mobile robot using binocular feature-based tracking", Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2007.
• Door detection: Build a semantic map of the locations of doors as the robot drives down a corridor.
  "Visual detection of lintel-occluded doors from a single camera", IEEE Computer Society Workshop on Visual Localization for Mobile Platforms (in association with CVPR), 2008.
Motivation for Path Following
• Goal: Enable mobile robot to follow a desired trajectory in both indoor and outdoor environments
• Applications: courier, delivery, tour guide, scout robots
• Previous approaches:
• Image Jacobian [Burschka and Hager 2001]
• Homography [Sagues and Guerrero 2005]
• Homography (flat ground plane) [Liang and Pears 2002]
• Man-made environment [Guerrero and Sagues 2001]
• Calibrated camera [Atiya and Hager 1993]
• Stereo cameras [Shimizu and Sato 2000]
• Omni-directional cameras [Adorni et al. 2003]
Our Approach to Path Following
• Key intuition: Vastly overdetermined system (dozens of feature points, one control decision)
• Key result: Simple control algorithm
  – Teach/replay approach using sparse feature points
  – Single, off-the-shelf camera
  – No calibration of camera or lens
  – Easy to implement (no homographies or Jacobians)
Preview of Results
[Video stills: milestone image, current image, top-down view, overview]
Tracking Feature Points
Kanade-Lucas-Tomasi (KLT) feature tracker
• Automatically selects features using the eigenvalues of the 2x2 gradient covariance matrix
• Automatically tracks features by minimizing sum of squared differences (SSD) between consecutive image frames
• Augmented with gain and bias to handle lighting changes
• Open-source implementation
The unknown displacement d of a feature window W between gray-level images I and J minimizes the dissimilarity

  ε(d) = ∫∫_W [ J(x + d/2) − I(x − d/2) ]² dx

and features are selected where the 2x2 matrix Z = ∫∫_W g(x) g(x)ᵀ dx (g = gradient of the image) has two large eigenvalues.

[http://www.ces.clemson.edu/~stb/klt]
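For illustration, roughly equivalent feature selection and tracking are available in OpenCV; a minimal sketch follows (filenames and parameter values are placeholders, and OpenCV's tracker does not include the gain-and-bias extension mentioned above):

```python
import cv2

# Two consecutive gray-level frames (placeholder filenames).
prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# Select features where the 2x2 gradient covariance matrix has two large eigenvalues.
pts = cv2.goodFeaturesToTrack(prev, maxCorners=200, qualityLevel=0.01, minDistance=7)

# Track features into the next frame by minimizing SSD over a small window
# (pyramidal Lucas-Kanade).
nxt, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, pts, None,
                                            winSize=(15, 15), maxLevel=3)

# Keep only the features that were tracked successfully.
good_prev = pts[status.flatten() == 1]
good_curr = nxt[status.flatten() == 1]
print(f"Tracked {len(good_curr)} of {len(pts)} features")
```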
Teach-Replay
[Diagram: Teaching phase — the robot is driven from start to destination while detecting and tracking features. Replay phase — the robot tracks features and compares each current feature against the corresponding initial and goal features recorded during teaching.]
Qualitative Decision Rule
[Diagram: landmark, image plane, feature, funnel lane, robot at goal, with feature coordinates uGoal and uCurrent]
• Feature is to the right: |uCurrent| > |uGoal| → "Turn right"
• Feature has changed sides: sign(uCurrent) ≠ sign(uGoal) → "Turn left"
• No evidence → "Go straight"
The Funnel Lane at an Angle
[Diagram: landmark, image plane, feature, robot at goal, and the funnel lane oriented at angle α]
• No evidence → "Go straight"
A Simplified Example
[Diagram: a landmark feature, the robot at the goal, and the funnel lane drawn at several robot positions along the path]
• "Turn right", "Turn left", or "Go straight", according to the robot's position relative to the funnel lane
The Funnel Lane Created by Multiple Feature Points
[Diagram: the funnel lanes of Landmark #1, Landmark #2, and Landmark #3, each with half-angle α, intersect to form the combined funnel lane]
• Feature is to the right → "Turn right"
• Side change → "Turn left"
• No evidence → "Do not turn"
Qualitative Control Algorithm
Funnel constraints: sign(u_C) = sign(u_D) and |u_C| ≤ |u_D|

Desired heading from the ith feature point:

  θ_i^d = min( u_C, φ(u_C, u_D) )   if u_C > 0 and u_C > u_D
          max( u_C, φ(u_C, u_D) )   if u_C < 0 and u_C < u_D
          0                          otherwise

where φ(u_C, u_D) is the signed distance between u_C and u_D, u_C is the feature's current horizontal image coordinate, and u_D is its coordinate in the destination (milestone) image.
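A minimal sketch of this per-feature rule follows, assuming image coordinates measured from the image center and a positive heading meaning "turn right"; for simplicity it steers proportionally to φ instead of applying the min/max clamp above, and the function name and gain are illustrative:

```python
import numpy as np

def qualitative_heading(u_current, u_goal, gain=0.01):
    """Per-feature heading vote from the funnel-lane rule (sketch).

    u_current, u_goal: horizontal image coordinates of one feature, relative
    to the image center. gain is an illustrative conversion from the pixel
    discrepancy to a steering angle (positive = turn right).
    """
    same_side = np.sign(u_current) == np.sign(u_goal)
    if same_side and abs(u_current) <= abs(u_goal):
        return 0.0                     # inside the funnel lane: go straight
    phi = u_current - u_goal           # signed distance between u_C and u_D
    return gain * phi                  # positive -> turn right, negative -> turn left
```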
Incorporating Odometry
Desired heading:

  θ^d = β θ_o + (1 − β) (1/N) Σ_{i=1..N} θ_i^d

where θ_o is the desired heading from odometry, θ_i^d is the desired heading from the ith feature point, N is the number of features, and β ∈ [0, 1].
Overcoming Practical Difficulties
• To deal with rough terrain: Prior to comparison, feature coordinates are warped to compensate for a non-zero roll angle about the optical axis, which is estimated with the RANSAC algorithm.
• To avoid obstacles: The robot detects and avoids an obstacle using sonar, and odometry enables the robot to roughly return to the path. Then the robot converges to the path using both odometry and vision.
Experimental Results
Videos available at http://www.ces.clemson.edu/~stb/research/mobile_robot
[Video stills: milestone image, current image, top-down view, overview]
Experimental Results
Videos available at http://www.ces.clemson.edu/~stb/research/mobile_robot
[Video stills: milestone image, current image, top-down view, overview]
Experimental Results: Rough Terrain
Experimental Results:Avoiding an Obstacle
Experimental Results
Indoor: Imaging Source Firewire camera
Outdoor: Logitech Pro 4000 webcam
Project Objectives
• Path following: Enable a mobile robot to follow a desired trajectory in both indoor and outdoor environments.
  1. "Qualitative vision-based mobile robot navigation", Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2006.
  2. "Qualitative vision-based path following", IEEE Transactions on Robotics, 2009.
• Person following: Enable a mobile robot to follow a person in a cluttered indoor environment by vision.
  "Person following with a mobile robot using binocular feature-based tracking", Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2007.
• Door detection: Detect doors as the robot drives down a corridor.
  "Visual detection of lintel-occluded doors from a single camera", IEEE Computer Society Workshop on Visual Localization for Mobile Platforms (in association with CVPR), 2008.
Motivation
• Goal: Enable a mobile robot to follow a person in a cluttered indoor environment using vision.
• Previous approaches:
  – Appearance properties (color, edges) [Sidenbladh et al. 1999, Tarokh and Ferrari 2003, Kwon et al. 2005]: assume the person's color differs from the background or that the person faces the camera; sensitive to lighting changes.
  – Optical flow [Piaggio et al. 1998, Chivilò et al. 2004]: drifts as the person undergoes out-of-plane rotation.
  – Dense stereo and odometry [Beymer and Konolige 2001]: difficult to predict the robot's movement (uneven surfaces, wheel slippage).
Our approach
• Algorithm: Sparse stereo based on Lucas-Kanade feature tracking.
• Handles:
  – Dynamic backgrounds
  – Out-of-plane rotation
  – Similar disparity between the person and background
  – Similar color between the person and background
System overview
Detect 3D features of the scene (cont.)
• Features are selected in the left image IL and matched in the right image IR.
Left image Right image
The size of each square indicates the horizontal disparity of the feature.
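As an illustration of this sparse binocular matching, here is a minimal sketch using OpenCV (cv2.goodFeaturesToTrack plus pyramidal Lucas-Kanade) as a stand-in for the tracker used in the paper; a rectified camera pair and placeholder parameter values are assumed:

```python
import cv2

def sparse_stereo_features(left, right, max_corners=300):
    """Select features in the left image and match them in the right image;
    the horizontal offset of each match is its disparity (sketch)."""
    pts_l = cv2.goodFeaturesToTrack(left, maxCorners=max_corners,
                                    qualityLevel=0.01, minDistance=5)
    pts_r, status, _ = cv2.calcOpticalFlowPyrLK(left, right, pts_l, None,
                                                winSize=(15, 15), maxLevel=3)
    ok = status.flatten() == 1
    pts_l, pts_r = pts_l[ok].reshape(-1, 2), pts_r[ok].reshape(-1, 2)
    disparity = pts_l[:, 0] - pts_r[:, 0]   # left x minus right x (rectified pair assumed)
    return pts_l, disparity
```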
System overview
Detecting Faces
• The Viola-Jones frontal face detector is applied.
• This detector is used both to initialize the system and to enhance robustness when the person is facing the camera.
Note: The face detector is not necessary in our system.
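OpenCV ships a pretrained Viola-Jones cascade, so a minimal sketch of this step could look like the following (the cascade file and detection parameters are OpenCV defaults, not values from the paper):

```python
import cv2

# OpenCV's bundled Viola-Jones frontal-face cascade.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(gray_image):
    """Return bounding boxes (x, y, w, h) of frontal faces, if any."""
    return cascade.detectMultiScale(gray_image, scaleFactor=1.1, minNeighbors=5)
```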
Overview of Removing Background
1) using the known disparity of the person in the previous image frame.
2) using the estimated motion of the background.
3) using the estimated motion of the person.
Remove Background — Step 1: Using the known disparity
Discard features for which |d_t − d̃_t| exceeds a threshold, where d̃_t is the known disparity of the person from the previous frame and d_t is the disparity of a feature at time t.
[Images: original features; foreground features after step 1; background features]
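A minimal sketch of this disparity test (the tolerance value and array layout are illustrative assumptions):

```python
import numpy as np

def filter_by_disparity(features, disparities, person_disparity, tol=2.0):
    """Keep features whose disparity is within tol pixels of the person's
    disparity from the previous frame (step 1 of background removal).

    features: N x 2 array of feature positions; disparities: length-N array."""
    disparities = np.asarray(disparities)
    keep = np.abs(disparities - person_disparity) <= tol
    return features[keep], disparities[keep]
```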
Remove Background — Step 2: Using background motion
• Estimate the motion of the background by computing a 4 × 4 affine transformation matrix H between the image frames at times t and t + 1:

  [ f_{t+1}^i ; 1 ] = H [ f_t^i ; 1 ]        (1)

  where f_t^i denotes the 3D coordinates of the ith feature at time t.
• The random sample consensus (RANSAC) algorithm is used to find the dominant motion.
Foreground features with similar disparity in step 1
Foreground features after step 2
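A sketch of this step using OpenCV's RANSAC-based cv2.estimateAffine3D, which returns the top 3 × 4 rows of the homogeneous transform in Eq. (1); treating each feature as an (x, y, disparity) point is an assumption of this sketch:

```python
import cv2
import numpy as np

def dominant_motion(pts_prev, pts_curr, threshold=3.0):
    """Fit the dominant affine motion between frames t and t+1 with RANSAC.

    pts_prev, pts_curr: N x 3 arrays of (x, y, disparity) feature coordinates.
    Returns the 3x4 affine matrix (top rows of the homogeneous 4x4 H) and a
    boolean inlier mask; outliers are candidate foreground features.
    """
    _, H, inliers = cv2.estimateAffine3D(
        np.asarray(pts_prev, np.float32), np.asarray(pts_curr, np.float32),
        ransacThreshold=threshold)
    return H, inliers.ravel().astype(bool)
```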
Remove Background — Step 3: Using person motion
• Similar to step 2, the motion model of the person is calculated.
• The person group should be the largest group.
• The centroid of the person group should be close to the previous location of the person.
Foreground features after step 2 Foreground features after step 3
System overview
System overview
Experimental Results
Video
Project Objectives
• Path following: Enable a mobile robot to follow a desired trajectory in both indoor and outdoor environments.
  1. "Qualitative vision-based mobile robot navigation", Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2006.
  2. "Qualitative vision-based path following", IEEE Transactions on Robotics, 2009.
• Person following: Enable a mobile robot to follow a person in a cluttered indoor environment by vision.
  "Person following with a mobile robot using binocular feature-based tracking", Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), 2007.
• Door detection: Detect doors as the robot drives down a corridor.
  "Visual detection of lintel-occluded doors from a single camera", IEEE Computer Society Workshop on Visual Localization for Mobile Platforms (in association with CVPR), 2008.
Motivation for Door Detection
Whether the robot builds a topological map or a metric map, doors are semantically meaningful landmarks.
Previous Approaches to Detecting Doors
• Range-based approaches: sonar [Stoeter et al. 1995], stereo [Kim et al. 1994], laser [Anguelov et al. 2004]
• Vision-based approaches: fuzzy logic [Munoz-Salinas et al. 2004], color segmentation [Rous et al. 2005], neural network [Cicirelli et al. 2003]
• Limitations:
  – require different colors for doors and walls
  – simplified environment (untextured floor, no reflections)
  – limited viewing angle
  – high computational load
  – assume lintel (top part) visible
What is Lintel-Occluded?
[Figure: door with lintel (top) and posts (sides)]
Lintel-occluded: in post-and-lintel architecture, the lintel is occluded in the image because the camera is low to the ground and cannot point upward due to obstacles.
Our Approach
Assumptions:
• Both door posts are visible
• Posts appear nearly vertical
• The door is at least a certain width
Key idea: Multiple cues are necessary for robustness (pose, lighting, …)
Pairs of Vertical Lines
1. Edges detected by Canny
2. Line segments detected by a modified Douglas-Peucker algorithm
3. Clean up (merge lines across small gaps, discard short lines)
4. Separate vertical and non-vertical lines
5. Door candidates given by all vertical line pairs whose spacing is within a given range
[Images: Canny edges, detected lines, vertical lines, non-vertical lines]
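A rough sketch of steps 1, 4, and 5 using OpenCV; note that it substitutes probabilistic Hough line detection (cv2.HoughLinesP) for the modified Douglas-Peucker detector, and all thresholds are illustrative:

```python
import cv2
import numpy as np

def door_candidate_pairs(gray, min_width=40, max_width=400, vert_tol_deg=10):
    """Find pairs of nearly vertical line segments whose spacing is within a
    given range; each pair is a door candidate (sketch)."""
    edges = cv2.Canny(gray, 50, 150)
    segs = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=50,
                           minLineLength=60, maxLineGap=5)
    verticals = []
    if segs is not None:
        for x1, y1, x2, y2 in segs[:, 0]:
            angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
            if abs(angle - 90) < vert_tol_deg:        # nearly vertical segment
                verticals.append(((x1 + x2) / 2.0, (x1, y1, x2, y2)))
    pairs = []
    for i, (xa, seg_a) in enumerate(verticals):
        for xb, seg_b in verticals[i + 1:]:
            if min_width <= abs(xa - xb) <= max_width:
                pairs.append((seg_a, seg_b))
    return pairs
```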
Homography
A point (x', y') in the world plane corresponds to a point (x, y) in the image through a homography H:

  (x', y', 1)ᵀ ≅ H (x, y, 1)ᵀ,   H = [ h11 h12 h13 ; h21 h22 h23 ; h31 h32 h33 ]
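Applying such a homography to a point is just a matrix product followed by a perspective divide; a minimal sketch (H itself would be estimated elsewhere, e.g. from corner correspondences):

```python
import numpy as np

def apply_homography(H, x, y):
    """Map an image point (x, y) to the world-plane point (x', y') with a
    3x3 homography H, dividing by the homogeneous coordinate."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]
```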
Prior Model Features: Width and Height
[Figure: door corners (0,0), (x,0), (0,y), (x,y) on the door plane and the principal point in the image; the homography is used to estimate the door's width and height]
An Example
As the door turns, its bottom corner traces an ellipse in the image (the projective transformation of a circle is an ellipse), but the ellipse is not horizontal.
Data Model (Posterior) Features
• Image gradient along edges (g1)
• Placement of top and bottom edges (g2, g3)
• Color (g4)
• Texture (g5)
• Kick plate (g6)
• Vanishing point (g7)
• … and two more (g8, g9, described next)
Data Model Features (cont.)
Bottom gap (g8): the intensity profile along a line across the bottom of the door reveals a gap beneath the door — darker than the floor when the light inside is off, brighter when the light is on — or no gap at all.
[Figure: intensity along the line for the darker (light off), brighter (light on), and no-gap cases]
Data Model Features (cont.)
Concavity (g9): the door is usually recessed into the wall, so the intersection line of the wall and floor, its extension, the bottom door edge, and the two vertical door lines (L_left, L_right) form a slim "U" shape.
[Figure: wall–door–wall layout on the floor, with the wall/floor intersection line, its extension, the bottom door edge offset by ε, and the vertical door lines]
Two Methods to Detect Doors
1. Adaboost: the strong classifier is

   ψ(x) = sgn( Σ_{n=1..N} α_n h_n(x) ),

   where h_n ∈ {−1, +1} is the hard decision of each weak classifier and the feature weights α_n are learned from training images.
2. Bayesian formulation: E(d) = E_data(I | d) + E_prior(d), with the feature weights learned from training images (yields better results).
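A minimal sketch of evaluating the Adaboost strong classifier above (the symbol names are illustrative, and the weights would come from training):

```python
import numpy as np

def strong_classifier(weak_decisions, weights):
    """Adaboost strong classifier: psi(x) = sgn(sum_n alpha_n * h_n(x)).

    weak_decisions: array of hard decisions h_n(x) in {-1, +1}, one per
    weak classifier (door feature); weights: the learned alpha_n.
    Returns +1 (door) or -1 (not a door)."""
    score = float(np.dot(weights, weak_decisions))
    return 1 if score >= 0 else -1
```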
Bayesian Formulation

  p(d | I) ∝ p(I | d) p(d),   where d = door and I = image.

Taking the log likelihood,

  E(d) = E_data(I | d) + E_prior(d)

  E_data(I | d) = Σ_{i=1..N_data} λ_i f_i(I, d)      (data model)
  E_prior(d) = Σ_{j=1..N_prior} γ_j g_j(d)           (prior model)

where f_i(I, d) and g_j(d) ∈ [0, 1].
MCMC and DDMCMC
• Markov chain Monte Carlo (MCMC) is used to maximize the probability of detecting the door (like a random walk through the state space of doors).
• Data-driven MCMC (DDMCMC) is used to speed up computation:
  – doors appear more frequently at positions close to the vertical lines
  – the top of the door is often occluded, or lies at the horizontal line closest to the top
  – the bottom of the door is often close to the wall/floor boundary
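A minimal sketch of the MCMC search over door hypotheses; the score and proposal functions are placeholders for the energy above, and a data-driven proposal would concentrate new samples near detected vertical lines and the wall/floor boundary:

```python
import numpy as np

def mcmc_door_search(score, propose, d0, iters=1000, temperature=1.0, rng=None):
    """Metropolis-style random walk over door hypotheses (sketch).

    score(d) returns the (higher-is-better) fit of hypothesis d to the image;
    propose(d, rng) returns a perturbed hypothesis. In the data-driven variant,
    propose() would favor positions suggested by the detected lines."""
    rng = rng or np.random.default_rng()
    d, s = d0, score(d0)
    best_d, best_s = d, s
    for _ in range(iters):
        d_new = propose(d, rng)
        s_new = score(d_new)
        # Accept uphill moves always, downhill moves with Boltzmann probability.
        if s_new >= s or rng.random() < np.exp((s_new - s) / temperature):
            d, s = d_new, s_new
            if s > best_s:
                best_d, best_s = d, s
    return best_d, best_s
```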
Experimental Results: Similar or Different Door/Wall Color
Experimental Results: High Reflection / Textured Floors
Experimental Results: Different Viewpoints
Experimental Results: Cluttered Environments
Results
• 25 different buildings, 600 images: 100 training, 500 testing
• 91.1% accuracy with 0.09 false positives per image
• Speed: 5 fps on a 1.6 GHz processor (unoptimized)
False Negatives and Positives
Failure causes include distracting or strong reflections, failure of the concavity and bottom-gap tests, erroneously detected concavity, and cases where two vertical lines are unavailable.
Navigation in a Corridor
• Doors were detected and tracked from frame to frame.
• False positives are discarded if doors are not repeatedly detected.
Conclusion
• Path following
  – Teach-replay, comparing image coordinates of feature points (no calibration)
  – Qualitative decision rule (no Jacobians or homographies)
• Person following
  – Detects and matches feature points between a stereo pair of images and between successive images
  – RANSAC-based procedure to estimate the motion of each region
  – Does not require the person to wear a color different from the background
• Door detection
  – Integrates a variety of door features
  – Adaboost training and DDMCMC
Future Work
• Path following
  – Incorporate higher-level scene knowledge to enable obstacle avoidance and terrain characterization
  – Connect multiple teaching paths in a graph-based framework to enable autonomous navigation between arbitrary points
• Person following
  – Fuse the information with additional appearance-based information (template or edges)
  – Integrate with an EM tracking algorithm
• Door detection
  – Calibrate the camera to enable pose and distance measurements, facilitating the building of a geometric map
  – Integrate into a complete navigation system that can drive down a corridor and turn into a specified room