
Improving the Kinect by Cross-Modal Stereo

Wei-Chen Chiu, Ulf Blanke, Mario Fritz

[Figure: early-fusion pipeline. Cross-modal adaptation combines the color channels with learned weights (w_r, w_g, w_b, w_ir) into an IR-like image; stereo matching against the IR image yields a disparity map, which is fused with the Kinect depth into a point cloud.]

[Figure: late-fusion pipeline. Stereo correspondences are computed between each RGB channel and the IR image, yielding multiple disparity maps; these are fused with the Kinect depth into a point cloud.]

Motivation

Kinect: from gaming interface to robotic perception.

[Figure: Kinect hardware. IR projector, RGB camera, IR camera; the IR projector and IR camera together form the 3D depth sensor.]

No sensor is perfect: fuse Kinect depth sensing with cross-modal stereo.

Problem

Kinect active sensing:
• good for homogeneous regions
• fails on some surfaces:
  ► specular
  ► transparent
  ► reflective

Passive stereo vision:
• hard for homogeneous regions
• able to detect disparities at the edges of transparent or reflective objects

[Figure: RGB image vs. IR image of the same scene.]

Method

Early fusion: estimate the optimal weighted combination of the RGB channels that is most IR-like, for improved stereo matching against the IR image.
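The poster does not spell out how the channel weights are optimized. As a rough illustration only, the sketch below fits them by least squares against an IR image captured with the projector covered, then matches the resulting IR-like image against the IR image with OpenCV's StereoSGBM. The matcher, its parameters, and all names (`rgb`, `ir`, `ir_like_weights`) are assumptions, not the authors' implementation.

```python
# Minimal early-fusion sketch. Assumes `rgb` is an HxWx3 uint8 image and
# `ir` is the aligned HxW uint8 IR image taken with the projector covered.
import cv2
import numpy as np

def ir_like_weights(rgb, ir):
    """Least-squares fit of per-channel weights w so that the weighted
    channel sum approximates the IR image (one plausible criterion;
    the poster does not state the actual objective)."""
    A = rgb.reshape(-1, 3).astype(np.float64)   # one row per pixel
    b = ir.reshape(-1).astype(np.float64)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

def early_fusion_disparity(rgb, ir):
    w = ir_like_weights(rgb, ir)
    ir_like = np.clip(rgb.astype(np.float64) @ w, 0, 255).astype(np.uint8)
    # Standard semi-global block matching between IR-like and IR images.
    sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=9)
    # SGBM returns fixed-point disparities scaled by 16.
    return sgbm.compute(ir_like, ir).astype(np.float32) / 16.0
```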

Late fusion: delay the combination of the color channels and compute stereo correspondences between each channel and the IR image independently. Fuse the resulting depth estimates with the depth sensed by the Kinect by taking the union of the point clouds.
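A minimal sketch of the late-fusion path under the same assumptions: one disparity map per color channel against the IR image, back-projection to 3D, and fusion with the Kinect point cloud by simple union. The focal length `F_PX` and baseline `BASELINE_M` are hypothetical calibration values; a real setup needs the calibrated rig.

```python
# Minimal late-fusion sketch. `kinect_pts` is the Kinect's own Nx3 point cloud.
import cv2
import numpy as np

F_PX, BASELINE_M = 580.0, 0.025   # assumed focal length (px) and baseline (m)

def channel_point_cloud(channel, ir):
    sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=9)
    disp = sgbm.compute(channel, ir).astype(np.float32) / 16.0
    ys, xs = np.nonzero(disp > 0)             # keep only valid disparities
    z = F_PX * BASELINE_M / disp[ys, xs]      # depth from disparity: Z = f*B/d
    x = (xs - channel.shape[1] / 2) * z / F_PX
    y = (ys - channel.shape[0] / 2) * z / F_PX
    return np.column_stack([x, y, z])

def late_fusion(rgb, ir, kinect_pts):
    # One disparity map per color channel, each matched against the IR image,
    # then fused with the Kinect depth by a union of the point clouds.
    clouds = [channel_point_cloud(rgb[:, :, c], ir) for c in range(3)]
    return np.vstack(clouds + [kinect_pts])
```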

Result

Evaluation on an object segmentation task from the 3D point cloud.

[Figure: object segmentation from the Kinect-only point cloud vs. the fused Kinect-and-stereo point cloud.]

• Dataset of a table-top scenario: 106 objects in 19 images.
• The best of the proposed fusion schemes achieves an average precision of 76.6%. Compared to 48.8% for the built-in Kinect depth estimate, this is a significant improvement of nearly 30 percentage points.
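The poster does not name the segmentation algorithm used in the evaluation. A common stand-in for table-top object segmentation from a point cloud is Euclidean clustering, sketched here with scikit-learn's DBSCAN; the parameters are guesses and the plane removal step is omitted.

```python
# Rough illustration of point-cloud object segmentation by clustering.
import numpy as np
from sklearn.cluster import DBSCAN

def segment_objects(points, eps=0.02, min_samples=50):
    """Euclidean clustering of an Nx3 point cloud (units: meters).
    Returns one cluster label per point, -1 for noise."""
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
```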

[Figure: (a) RGB image converted by optimized weights. (b) IR image (projector covered). (c) RGB image converted to grayscale. (d) Disparity from (a) and (b). (e) Disparity from (c) and (b).]


