generative modeling of depth observations with gmms for active … · 2019. 6. 25. · generative...

Generative Modeling of Depth Observations with GMMs for Active Perception

Wennie Tabib

I. OVERVIEW

Gaussian Mixture Models (GMMs) are well suited to compactly represent sensor observations and model structural correlations present in the environment. These generative models are advantageous compared to voxelized representations, which assume independence between cells and lose dependencies between spatially distinct locations. This abstract surveys recent works that leverage GMMs for mapping, deriving occupancy, and estimating pose to enable active perception tasks.

II. GAUSSIAN MIXTURE MODEL

A GMM consists of a weighted combination of $M$ Gaussian distributions. The probability density of the GMM is represented as

$$p(x \mid \xi) = \sum_{m=1}^{M} \pi_m \, \mathcal{N}(x \mid \mu_m, \Lambda_m)$$

where $x \in \mathbb{R}^D$, $\pi_m$ is a weight such that $0 \le \pi_m \le 1$ and $\sum_{m=1}^{M} \pi_m = 1$, and $\mathcal{N}(x \mid \mu, \Lambda)$ is a $D$-dimensional Gaussian density function with mean $\mu$ and covariance matrix $\Lambda$:

$$\mathcal{N}(x \mid \mu, \Lambda) = \frac{|\Lambda|^{-1/2}}{(2\pi)^{D/2}} \exp\left(-\frac{1}{2}(x - \mu)^{T} \Lambda^{-1} (x - \mu)\right)$$

The parameters of the distribution are compactly represented as $\xi = \{\pi_m, \mu_m, \Lambda_m\}_{m=1}^{M}$. Estimating the parameters of a GMM remains an open area of research [4], but the Expectation Maximization (EM) algorithm is often used, with the distributions initialized using a clustering approach such as the K-Means++ algorithm (see Fig. 1 for an example) [1].
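As a minimal sketch of the fitting procedure described above, the following pure-Python example runs EM on a one-dimensional GMM whose means are seeded with K-Means++-style distance-squared sampling. It is an illustration under simplifying assumptions, not the implementation used in the cited works: the data are scalar (so each covariance $\Lambda_m$ reduces to a variance), only the seeding step of K-Means++ is used rather than a full clustering pass, and the names `fit_gmm_1d`, `kmeanspp_seeds`, and `min_var` are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def kmeanspp_seeds(data, num_components, rng):
    """Pick initial means by K-Means++ seeding (distance-squared sampling)."""
    seeds = [rng.choice(data)]
    while len(seeds) < num_components:
        # Squared distance from each point to its nearest existing seed.
        d2 = [min((x - s) ** 2 for s in seeds) for x in data]
        seeds.append(rng.choices(data, weights=d2, k=1)[0])
    return seeds


def fit_gmm_1d(data, num_components, num_iters=50, seed=0, min_var=1e-6):
    """Fit a 1-D GMM with EM; returns (weights, means, variances)."""
    rng = random.Random(seed)
    n = len(data)
    means = kmeanspp_seeds(data, num_components, rng)
    # Start every component with the global variance and a uniform weight.
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    variances = [max(var_all, min_var)] * num_components
    weights = [1.0 / num_components] * num_components

    for _ in range(num_iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300  # guard against underflow to zero
            resp.append([j / total for j in joint])

        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(num_components):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, min_var)

    return weights, means, variances
```

Calling `fit_gmm_1d(samples, 2)` on a list of scalar samples returns the estimated $\pi_m$, $\mu_m$, and variances. The variance floor `min_var` is one simple way to keep a component from collapsing onto a single point; the full $D$-dimensional case replaces the scalar variance with a covariance matrix.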

III. OCCUPANCY RECONSTRUCTION

Several methods have been developed to derive occupancy with Gaussian Mixture Models. O’Meadhra et al. [6] learn the evidence of occupied and free space to derive the probability of occupancy and enable occupancy grid maps to be generated in local regions and at arbitrary resolutions. To enable online use, Tabib et al. [9] update the probability of occupancy in a local occupancy grid map maintained around the robot’s pose using the inverse sensor model [10]. A visualization of the process by which occupancy may be reconstructed is shown in Fig. 2.

Fig. 1: Overview of the process to create a GMM from a sensor observation. (a) An image taken from Rapps Cave in Greenbrier, WV. (b) The corresponding pointcloud, colored according to viewing distance (red is farther away and blue is closer to the camera). (c) Each color corresponds to a cluster created during initialization via the K-Means++ algorithm. The cluster labels assigned by K-Means++ are used by EM to maximize the log likelihood of the data given the parameters. The red ellipsoids in (d) and (e) are 1-sigma and 2-sigma representations, respectively, of the covariance matrices learned after running EM. Because the GMM is a generative model, the distribution can be sampled to obtain the model shown in (f).

Fig. 2: Visualization of the method by which occupancy may be reconstructed. (a) A representative example of pointcloud data from a mine environment, where occupied points within a 15 m range are shown in red and points outside this range are projected to 15 m and shown in blue. (b) 200-component GMMs created from the occupied points (red) and the projected free-space points (blue). (c) 100 points are sampled from the occupied and free-space GMMs, where the number 100 is chosen for illustration purposes only. These points are used to update the occupancy values in the occupancy grid shown in grey. The points are projected to the sensor origin and all voxels along the beam are updated to incorporate the observed free space.
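The ray update described for the local occupancy grid can be sketched as a log-odds update in the spirit of the inverse sensor model [10]. The example below is a 2-D illustration under stated assumptions rather than the implementation of [9]: the grid is a dictionary keyed by cell index, the ray is traversed by sampling at half-cell steps (an approximation that can miss cells clipped at a corner), and the increments `l_occ` and `l_free` and the clamping bounds are hypothetical values chosen for illustration.

```python
import math


def clamp(value, low, high):
    """Restrict value to the interval [low, high]."""
    return max(low, min(high, value))


def cells_along_ray(origin, endpoint, cell_size):
    """Approximate the grid cells crossed from origin to endpoint (endpoint cell last)."""
    ox, oy = origin
    ex, ey = endpoint
    length = math.hypot(ex - ox, ey - oy)
    steps = max(1, int(math.ceil(length / (0.5 * cell_size))))
    cells = []
    for i in range(steps + 1):
        t = i / steps
        cell = (int(math.floor((ox + t * (ex - ox)) / cell_size)),
                int(math.floor((oy + t * (ey - oy)) / cell_size)))
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells


def update_log_odds(grid, origin, point, hit, cell_size=1.0,
                    l_occ=0.85, l_free=-0.4, l_min=-4.0, l_max=4.0):
    """Update the log-odds grid for one sampled point.

    hit=True  : point sampled from the occupied GMM; the end cell gains evidence
                of occupancy and the cells before it gain evidence of free space.
    hit=False : point sampled from the free-space GMM; every cell on the beam,
                including the end cell, gains evidence of free space.
    """
    cells = cells_along_ray(origin, point, cell_size)
    for cell in cells[:-1]:
        grid[cell] = clamp(grid.get(cell, 0.0) + l_free, l_min, l_max)
    last = cells[-1]
    increment = l_occ if hit else l_free
    grid[last] = clamp(grid.get(last, 0.0) + increment, l_min, l_max)


def occupancy_probability(log_odds):
    """Convert a log-odds value back to a probability of occupancy."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))
```

Cells that are never touched keep an implicit log-odds of 0, which corresponds to an occupancy probability of 0.5 (unknown). Clamping the log-odds keeps a cell from becoming so certain that later observations cannot change it.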

IV. REGISTRATION AND SLAM

Let $G_i(x)$ and $G_j(x)$ denote GMMs trained from sensor observations $Z_i$ and $Z_j$, respectively, and let $T(\cdot, \theta)$ denote the rigid transformation consisting of a rotation $R$ and translation $t$. To register $G_j(x)$ into the frame of $G_i(x)$, optimal rotation and translation parameters must be found such that the squared L2 norm between the distributions $G_i(x)$ and $T(G_j(x), \theta)$ is minimized. Tabib et al. [8] develop a closed-form objective function and employ a Riemannian trust-region method with conjugate gradients [2] to find the optimal rigid transformation parameters. This registration formulation is extended in [7] to incorporate global consistency via a factor graph [5].
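To illustrate why a closed form exists, the squared L2 norm between two GMMs can be expanded with the standard Gaussian product identity. The derivation below is a sketch of that general identity and is not claimed to be the exact objective of [8]; the symbols $\tau_k$, $\nu_k$, $\Omega_k$ for the parameters of $G_j$ and $K$ for its number of components are notation introduced here for illustration.

```latex
% Assumed notation: G_i has parameters {\pi_m, \mu_m, \Lambda_m}_{m=1}^{M},
% G_j has parameters {\tau_k, \nu_k, \Omega_k}_{k=1}^{K}.
% A rigid transformation maps each component of G_j to
%   \mathcal{N}(x \mid R\nu_k + t,\; R\,\Omega_k R^{T}).
% Gaussian product identity:
%   \int \mathcal{N}(x \mid a, A)\,\mathcal{N}(x \mid b, B)\,dx
%     = \mathcal{N}(a \mid b,\; A + B).
\begin{align}
\int \bigl( G_i(x) - T(G_j(x), \theta) \bigr)^2 dx
  &= \sum_{m=1}^{M} \sum_{n=1}^{M} \pi_m \pi_n \,
     \mathcal{N}(\mu_m \mid \mu_n,\; \Lambda_m + \Lambda_n) \\
  &\quad - 2 \sum_{m=1}^{M} \sum_{k=1}^{K} \pi_m \tau_k \,
     \mathcal{N}(\mu_m \mid R\nu_k + t,\; \Lambda_m + R\,\Omega_k R^{T}) \\
  &\quad + \sum_{k=1}^{K} \sum_{l=1}^{K} \tau_k \tau_l \,
     \mathcal{N}(\nu_k \mid \nu_l,\; \Omega_k + \Omega_l)
\end{align}
```

The first and third terms do not depend on $R$ or $t$, since a rigid transformation preserves both the determinant and the Mahalanobis distance in the self-term of $G_j$. Minimizing the squared L2 norm is therefore equivalent to maximizing the cross term over the rigid transformation parameters.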

V. EXPLORATION

Tabib et al. [9] develop a method for real-time information-theoretic exploration. The method maintains a local occupancy grid map around the robot’s current pose, samples points from a GMM map, and raytraces the sampled points to the sensor origin. Trajectories are represented as forward arc motion primitives, and the Cauchy-Schwarz Quadratic Mutual Information [3] is used to select the action that maximizes the information gain between the map (represented as an occupancy grid) and the sensor observation.
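The action-selection step can be sketched as follows, assuming a planar unicycle model in which each forward arc motion primitive is defined by a constant linear velocity and a constant angular velocity held for a fixed duration. This is only an illustration of the selection structure: the scoring function `info_gain` is a hypothetical placeholder supplied by the caller and does not compute the Cauchy-Schwarz Quadratic Mutual Information, and the function names are not taken from [9].

```python
import math


def forward_arc_endpoint(pose, v, omega, duration):
    """End pose (x, y, theta) after driving a constant-velocity arc.

    pose     : starting (x, y, theta), with theta in radians
    v        : linear velocity
    omega    : angular velocity (zero gives a straight segment)
    duration : time for which the velocities are held
    """
    x, y, theta = pose
    if abs(omega) < 1e-9:
        # Straight-line limit of the arc.
        return (x + v * duration * math.cos(theta),
                y + v * duration * math.sin(theta),
                theta)
    new_theta = theta + omega * duration
    radius = v / omega
    return (x + radius * (math.sin(new_theta) - math.sin(theta)),
            y - radius * (math.cos(new_theta) - math.cos(theta)),
            new_theta)


def select_action(pose, actions, duration, info_gain):
    """Return the (v, omega) pair whose end pose scores highest.

    info_gain is a caller-supplied function mapping an end pose to a score;
    it stands in for an information-theoretic reward.
    """
    return max(
        actions,
        key=lambda a: info_gain(forward_arc_endpoint(pose, a[0], a[1], duration)),
    )
```

For example, `select_action((0.0, 0.0, 0.0), [(1.0, -0.5), (1.0, 0.0), (1.0, 0.5)], 1.0, score)` evaluates three arcs from the origin and returns the one whose end pose maximizes `score`. A fuller treatment would evaluate the reward along the whole primitive rather than only at its end pose.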

VI. CONCLUSION AND FUTURE WORK

While GMMs have been demonstrated to be useful for compactly representing sensor observations, reconstructing occupancy, and estimating pose, a number of challenges remain, and overcoming them could significantly accelerate adoption. Learning a GMM is computationally expensive, so methods that accelerate learning these models would be of tremendous benefit to the robotics community. Another interesting application is segmentation for robotic grasping and manipulation. Multi-robot applications would also benefit from sharing these perceptual models across robotic swarms.

REFERENCES

[1] C. Bishop. Pattern Recognition and Machine Learning. Springer-Verlag New York, New York, 2 edition, 2007.

[2] N. Boumal, B. Mishra, P.-A. Absil, and R. Sepulchre. Manopt, a MATLAB toolbox for optimization on manifolds. The Journal of Machine Learning Research, 15(1):1455–1459, 2014.

[3] B. Charrow, S. Liu, V. Kumar, and N. Michael. Information-theoretic mapping using Cauchy-Schwarz quadratic mutual information. In 2015 IEEE International Conference on Robotics and Automation (ICRA), pages 4791–4798, May 2015. doi: 10.1109/ICRA.2015.7139865.

[4] R. Hosseini and S. Sra. An alternative to EM for Gaussian mixture models: Batch and stochastic Riemannian optimization. arXiv preprint arXiv:1706.03267, 2017.

[5] M. Kaess, H. Johannsson, R. Roberts, V. Ila, J. J. Leonard, and F. Dellaert. iSAM2: Incremental smoothing and mapping using the Bayes tree. The International Journal of Robotics Research, 31(2):216–235, 2012.

[6] C. O’Meadhra, W. Tabib, and N. Michael. Variable resolution occupancy mapping using Gaussian mixture models. IEEE Robotics and Automation Letters, page 1, 2018. doi: 10.1109/LRA.2018.2889348. Early access.

[7] W. Tabib. Approximate Continuous Belief Distributions for Exploration. PhD thesis, Carnegie Mellon University, Pittsburgh, PA, May 2019. URL http://reports-archive.adm.cs.cmu.edu/anon/2019/CMU-CS-19-108.pdf.

[8] W. Tabib, C. O’Meadhra, and N. Michael. On-manifold GMM registration. IEEE Robotics and Automation Letters, 3(4):3805–3812, 2018.

[9] W. Tabib, K. Goel, J. Yao, M. Dabhi, C. Boirum, and N. Michael. Real-time information-theoretic exploration with Gaussian mixture model maps. In Proceedings of Robotics: Science and Systems, Freiburg, Germany, June 2019.

[10] S. Thrun, W. Burgard, and D. Fox. Probabilistic Robotics. The MIT Press, Cambridge, MA, 2005. ISBN 0262201623.
