
Efficient Surface Detection for Augmented Reality on 3D Point Clouds

is still lost in this method. On the other hand, region growing is another popular choice. Holz et al. [4] cluster points via polygonal meshing according to their surface normal deviation, which requires per-point normal vector calculation. As for curved surfaces, few methods have been proposed. Stein et al. [12] propose an efficient approach that classifies convex and concave edges based on simple criteria operating on local regions. With concave patches representing the edges of objects, the resulting locally convex connected subgraphs represent object parts and their surfaces with high accuracy. However, the lack of relations between isolated clusters causes over-segmentation of broad planes such as the background, which are essential for AR.

B. Contributions This work makes the following contributions.

• Combination of local region growing and top-down refinements to retain the advantages of both.

• Linear-time detection that differentiates planes and curved surfaces in one iteration through the whole cloud, which implies that no point is visited twice.

• Customizable “curved weighting” for different outcomes and thus different applications.

• Applicable to both organized and unorganized point clouds.

2. APPROACH We propose a normal-vector-based agglomerative algorithm to detect curved as well as planar surfaces in 3D point clouds. A flowchart of the proposed system is shown in Figure 2. First, supervoxel segmentation [8] is employed to build a simplified but organized point-based 3D model. Then, certain random

ABSTRACT Surfaces are now where augmented reality comes true. In this paper, we propose an efficient, learning-free and reliable way to detect not only planar but also curved surfaces. Furthermore, our approach combines the advantages of local region growing and top-down refinements to retrieve complete surfaces in both organized point clouds and prebuilt 3D models. First, we obtain a down-sampled graph by supervoxel segmentation. Afterward, a recursive bottom-up agglomerative hierarchical clustering approach iteratively merges the supervoxels into surfaces. Finally, top-down refinements on noisy and occluded planes correct over-segmentations. To sum up, this is a model- and learning-free approach with experimentally demonstrated efficiency: around 80 seeds and 3000 supervoxels are computed for a 640×480 test point cloud, costing less than 0.2 sec on an ordinary laptop with a 2.6 GHz Intel Core i5, 8 GB RAM and no GPU acceleration.

Keywords Surface detection; Plane segmentation; Augmented reality; 3D point clouds; Agglomerative growing; Region-varying criterion

1. INTRODUCTION In the world of augmented reality (AR), planes have long been the favored choice because they are simple to detect in point clouds [9]. In our opinion, however, the inclusion of curved surfaces will bring a new level of AR experience. Therefore, we dedicate this work to a complete surface detection system that offers a broader interface for AR.

A. Related Works Among the different algorithms for plane extraction on 3D point clouds, RANSAC [11] is widely known and used; it naively generates planar functions from random seeds to fit the original graph. The method results in poor local-region connectivity and is an inefficient trial-and-error approach. To gain efficiency, Oehler et al. [5] first process the cloud at coarse resolution with a Hough transform and connected-component analysis, and then apply RANSAC to refine each of the resulting surfels. Nevertheless, the local-region connectivity

Figure 1. (a) Input point cloud from Kinect Segmentation Dataset [7]. (b) Centroids and adjacency links of supervoxel segmentation. (c) Result of our surface detection system. (d) Example of augmented reality.

Figure 2. Flowchart of our algorithm. Box labels: Point clouds; Supervoxel Segmentation; Non-duplicated Random Seeds; Agglomerative Surface Growing; Refinements and Planar Recombination; 3D AR model; Augmented Reality System.

Yen-Cheng Kung, Yung-Lin Huang, Shao-Yi Chien

National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, 10617 Taiwan

[email protected], [email protected], [email protected]

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

CGI '16, June 28-July 01, 2016, Heraklion, Greece © 2016 ACM. ISBN 978-1-4503-4123-3/16/06...$15.00 DOI: http://dx.doi.org/10.1145/2949035.2949058

seeds are systematically chosen as the starting points of the growing algorithm. Afterward, the surfaces grow by including adjacent supervoxels that comply with our region-varying criterion. In the last step, we eliminate noise and combine the over-segmented planar surfaces according to their planar functions. The flow diagram of the algorithm can be seen in Figure 3.

2.1 Supervoxel Segmentation “Supervoxel segmentation” clusters neighboring voxels with similar properties in order to derive an over-segmented cloud. The size of a supervoxel is determined by a seed resolution, while the clustering criterion depends on adjustable weightings of color, position, and normal vector. In this work, we set the color weighting to zero, since a single surface can contain many colors. In contrast, we increase the normal vector and position weightings in order to retain structural relations. After this step, the model consists of supervoxels, each of which has its own centroid, normal vector, and adjacency list.

2.2 Non-duplicated Random Seeds With the over-segmented 3D point cloud constructed by supervoxel segmentation, we aim to combine those segments with our growing method. To start the growing, we choose random seeds only among the “unreached” supervoxels, i.e., those that have not yet been seeded or included. In other words, each supervoxel can be seeded or included only once, which results in “non-duplicated” efficiency.
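The seeding rule can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the names `seed_and_label` and `grow_everything` are hypothetical, the adjacency is assumed to be a plain dictionary, and the growing rule is a placeholder that accepts every unreached neighbor. A single random shuffle stands in for repeated random draws among the unreached supervoxels.

```python
import random

def seed_and_label(adjacency, grow, rng=None):
    """Seed only among unreached supervoxels until every one is labeled.

    adjacency: dict mapping supervoxel id -> list of adjacent ids.
    grow: callable(seed, labels) -> set of unlabeled ids (seed included).
    Returns (labels, seeds); labels maps id -> surface candidate index.
    """
    rng = rng or random.Random()
    order = list(adjacency)
    rng.shuffle(order)            # random visiting order, drawn once
    labels = {}
    seeds = []
    for sv in order:
        if sv in labels:          # already seeded or included: never reused
            continue
        seeds.append(sv)
        for member in grow(sv, labels):
            labels[member] = len(seeds) - 1
    return labels, seeds

def grow_everything(adjacency):
    """Placeholder growing rule (hypothetical): accept any unreached neighbor."""
    def grow(seed, labels):
        cluster, stack = {seed}, [seed]
        while stack:
            current = stack.pop()
            for nb in adjacency[current]:
                if nb not in labels and nb not in cluster:
                    cluster.add(nb)
                    stack.append(nb)
        return cluster
    return grow

# Two disconnected groups of supervoxels: exactly two seeds are needed.
adjacency = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
labels, seeds = seed_and_label(adjacency, grow_everything(adjacency), random.Random(0))
```

Because a labeled supervoxel is skipped when its turn comes, the number of seeds equals the number of surface candidates rather than the number of supervoxels.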

2.3 Agglomerative Surface Growing Every time a seed is chosen, our growing algorithm is called recursively along the adjacencies until no adjacent supervoxel fits the criterion. Starting from a seeded supervoxel, Sseed, it includes the adjacent supervoxels, Sadj, that comply with our normal criterion C(·,·), given as equation (1).

Note that Nseed is the normal vector of Sseed and Nadj is the normal vector of Sadj. After supervoxels are included successfully, they form a supervoxel cluster Sclus, which continues to include the adjacencies of the new cluster, still denoted as Sadj. Therefore, we can write down the next criterion as equation (2).

Figure 3. Flow diagram of our algorithm. (A) Illustrations of Supervoxel Segmentation with visualized adjacencies. (B) Non-duplicated Random Seeds chosen among unreached supervoxels. (C) Agglomerative Surface Growing, which will recursively include adjacencies until no adjacent supervoxel fits the region-varying criterion. (D) Refinements and Planar Recombinations will reconnect those split planes by their planar functions based on average positions and normal vectors. (E) Labeled result of our algorithm.

Note that equations (1) and (2) differ in that Nclus in (2), unlike Nseed in (1), is no longer the normal vector of a single seed supervoxel; instead, it is a combination of the normal vectors in Sclus. More specifically, Nclus is determined by equation (3), which allows Sclus to include adjacencies with dissimilar normal vectors.

In this equation, Nedge is the averaged normal vector of the supervoxels in Sedge, which consists of the supervoxels on the edge of Sclus. More precisely, Sedge is the very edge of Sclus that directly connects to Sadj, the newly included supervoxels in the last iteration. In contrast, Nclus−edge is the normal vector of Sclus excluding Sedge. The relations between the notations above are shown in Figure 4.

Last but not least, µ stands for an adjustable curved weighting that makes our criterion region-varying. It decays the weight of the previous criterion and puts weight on the newly clustered adjacencies. Because of this “updating” feature, we can grow through not only planar but also curved surfaces. Furthermore, the curved weighting also acts as a parameter controlling how curved an eligible surface is allowed to be. Once no more adjacent supervoxels can be included by equation (2), Agglomerative Surface Growing terminates, and the grown cluster is labeled as a surface candidate. Afterward, another iteration starts from another Non-duplicated Random Seed; hence, repeating these two steps finally goes through all the supervoxels and labels them as different surface candidates for the next step.

2.4 Refinements and Planar Recombinations Isolated surface candidates with fewer than three supervoxels are considered to be noise and are hence removed. Furthermore,

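$$C(S_{seed}, S_{adj}) = \begin{cases} \text{true}, & N_{seed} \cdot N_{adj} > \text{threshold} \\ \text{false}, & \text{otherwise} \end{cases} \tag{1}$$

$$C(S_{clus}, S_{adj}) = \begin{cases} \text{true}, & N_{clus} \cdot N_{adj} > \text{threshold} \\ \text{false}, & \text{otherwise} \end{cases} \tag{2}$$

$$N_{clus} = \mu N_{edge} + (1 - \mu) N_{clus-edge} \tag{3}$$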
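As a quick check of how the curved weighting acts, equation (3) can be written out for its two limiting values and for one intermediate value. This is a direct substitution into (3); the interpretation in the following sentences is a reading of the formula, not a statement from the authors.

```latex
\begin{align*}
\mu = 0   &: \quad N_{clus} = N_{clus-edge} \\
\mu = 1   &: \quad N_{clus} = N_{edge} \\
\mu = 0.2 &: \quad N_{clus} = 0.2\, N_{edge} + 0.8\, N_{clus-edge}
\end{align*}
```

With µ = 0 the newest edge has no influence on the reference normal, so growth stays close to a single orientation. With µ = 1 only the newest edge matters, so the reference normal can follow a bending surface.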

Figure 5. Comparisons between labeled graphs of different steps. (A) Local grown region by Agglomerative Surface Growing. (B) Result after Planar Recombinations, which reconnects the split table, separated wall, etc. (C) Curvature map, lightness stands for curvature, i.e., lighter parts represent surfaces with less curvature.


Figure 4. Illustration of relations between different supervoxel clusters and their normal vectors.

Table 1. Runtime of different methods per frame (640×480 RGB-D from Microsoft Kinect), measured on a 2.6 GHz Intel Core i5 with 8 GB RAM and no GPU acceleration.

Approach            Plane   Curvature   Time (sec)
Oehler et al. [5]   √       -           2.00
Holz et al. [4]     √       -           0.50
Stein et al. [12]   √       √           0.41
Our Approach        √       √           0.19

neglected in a rich AR experience. In comparison with our approach, RANSAC, a well-known top-down fitting approach, takes every point in the whole graph into random plane fitting. It produces convincing results on a big plane such as the background but yields poor results on relatively small objects for lack of region information, as shown in Figure 7 (B). In contrast, region growing methods such as Local Convexity, proposed by Stein et al. [12], perform well on the objects in the graph but fail to detect the wholesome

C. Flexibility The adjustable curved weighting affects the result of our growing algorithm. The higher the curved weighting, the more flexibly the algorithm grows across changes in the surface normal. Note that the runtime is the same for different curved weightings. Results for different curved weightings can be seen in Figure 8.

D. Feasibility of Augmented Reality In order to demonstrate the feasibility of AR based on the proposed surface detection system, we carry out some basic experiments using common techniques. Note that because the depth data from Microsoft Kinect is often too noisy to yield an accurate normal vector, we can use neither the normal vectors of the supervoxels nor the average normal vectors of the labeled surfaces. In this experiment, we use the positions obtained by our system for the translation matrices, but run RANSAC separately on our labeled surfaces to obtain accurate normal vectors, which determine the transformation matrices. After aligning the augmented clouds with the translation and transformation matrices, we can visualize the combination of the original cloud and the augmented points to derive examples like A.1 and A.2 in Figure 9. While this method may cause some defects in the details, we can solve the problem by projecting the RGB color of the nearest augmented point onto each original point, as in B.1 and B.2 in Figure 9. However, this replacement method may suffer from the limited resolution of the original cloud. In conclusion, a good strategy is to augment points on clouds with low resolution, and to project RGB onto clouds with higher resolution and more detail.
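One way to combine a surface position and a surface normal into a single alignment matrix is sketched below. The convention is an assumption made for illustration only: the augmented model is taken to have its origin at its base and its +z axis pointing up, and the rotation is built with Rodrigues' formula. The function names are hypothetical, and the normal is assumed to come from a separate plane fit as described above.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def matmul3(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_z_to(normal):
    """3x3 rotation that takes the +z axis onto the direction of `normal`."""
    length = math.sqrt(dot(normal, normal))
    n = tuple(c / length for c in normal)
    z = (0.0, 0.0, 1.0)
    axis = cross(z, n)
    s = math.sqrt(dot(axis, axis))      # sine of the rotation angle
    c = dot(z, n)                       # cosine of the rotation angle
    if s < 1e-12:
        # Parallel: identity. Anti-parallel: half turn about the x axis.
        flip = 1.0 if c > 0 else -1.0
        return [[1.0, 0.0, 0.0], [0.0, flip, 0.0], [0.0, 0.0, flip]]
    k = tuple(a / s for a in axis)
    K = [[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]]
    K2 = matmul3(K, K)
    # Rodrigues' formula: R = I + sin * K + (1 - cos) * K^2
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + (1.0 - c) * K2[i][j]
             for j in range(3)] for i in range(3)]

def surface_pose(centroid, normal):
    """4x4 homogeneous matrix: rotate +z onto `normal`, then move to `centroid`."""
    R = rotation_z_to(normal)
    return [R[i] + [float(centroid[i])] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def apply_pose(T, point):
    x, y, z = point
    return tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3]
                 for i in range(3))

# A surface centered at (1, 2, 3) whose normal points along +y.
T = surface_pose((1.0, 2.0, 3.0), (0.0, 1.0, 0.0))
tip = apply_pose(T, (0.0, 0.0, 1.0))    # model point one unit "up"
```

Here the model origin lands on the surface centroid and the model's up direction follows the surface normal, so `tip` ends up one unit above the surface along +y.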

Figure 8. Example results with different curved weightings, µ. (A) Original graph. (B) µ=0.0, almost reduced to planar detection. (C) µ=0.1, fewer patches. (D) µ=0.2, moderate flexibility. (E) µ=0.4. (F) µ=0.5, surfaces of all curvatures are connected, thus producing an object-like result.

Figure 7. Comparisons of different approaches from Object Segmentation Database [9]. Note that differences in projection caused some extra black shadows in series (D). (A) Original Graph. (B) RANSAC [11]. (C) Local Convexity [12]. (D) The proposed approach can detect broad and detailed surfaces for AR.

the occluded broad planes such as backgrounds tend to be over-segmented due to our region growing strategy. Therefore, planes with similar planar functions are recombined, where the similarity is calculated from the angle differences between normal vectors and the distances between planes. In this step, the region-grown candidates are refined from a top-down perspective.
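The recombination test can be sketched as follows. It is an illustration under assumptions: each surface candidate is summarized by its average position and unit normal, the two thresholds (`max_angle_deg`, `max_dist`) are hypothetical values, normals pointing in opposite directions are treated as the same orientation, and a small union-find structure is used to merge matching candidates. None of these choices is specified in the text.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def similar_planes(c1, n1, c2, n2, max_angle_deg=10.0, max_dist=0.02):
    """True if two planes (centroid, unit normal) are nearly the same plane."""
    # Angle difference between normals; abs() ignores a flipped normal.
    if abs(dot(n1, n2)) < math.cos(math.radians(max_angle_deg)):
        return False
    # Distance of each centroid from the other candidate's plane.
    diff = tuple(b - a for a, b in zip(c1, c2))
    return max(abs(dot(diff, n1)), abs(dot(diff, n2))) <= max_dist

def recombine(candidates, **thresholds):
    """candidates: list of (centroid, unit_normal). Returns one group id each."""
    parent = list(range(len(candidates)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in range(len(candidates)):
        for j in range(i + 1, len(candidates)):
            if similar_planes(*candidates[i], *candidates[j], **thresholds):
                parent[find(j)] = find(i)
    return [find(i) for i in range(len(candidates))]

candidates = [
    ((0.0, 0.0, 0.0), (0.0, 0.0, 1.0)),     # table, left part
    ((1.0, 0.0, 0.005), (0.0, 0.0, 1.0)),   # table, right part (split by an object)
    ((0.0, 0.0, 0.5), (0.0, 0.0, 1.0)),     # parallel shelf, 0.5 units higher
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),     # wall, perpendicular to the table
]
groups = recombine(candidates)
```

The two table parts end up in one group, while the shelf is rejected by the distance test and the wall by the angle test. The pairwise loop is quadratic in the number of surface candidates, which is small compared with the number of supervoxels.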

3. EXPERIMENTS AND RESULTS

A. Efficiency Although it may seem time consuming to iterate through all the supervoxels, in practice only 60 to 120 seeds are produced among 3000 supervoxels from 640×480 Kinect graphs, far fewer than the average number of seeds RANSAC needs to find a single maximum plane. Moreover, every supervoxel is visited only once, leading to linear time complexity during Agglomerative Surface Growing. The runtimes of different algorithms are listed in Table 1. The relation between runtime and point cloud size is shown in Figure 6.

B. Robustness The final step, Planar Recombinations, enriches the preceding growing algorithm with top-down refinements. In contrast to the purely top-down or purely bottom-up approaches used in other methods, our approach first performs local-region growing and then refines the results from a top-down perspective, avoiding the disadvantages of both. The difference between results without and with Planar Recombinations can be seen in Figure 5 (A) and (B). Most of all, broad planes such as the background as well as small surface regions such as book covers should not be

Figure 6. Runtime of the proposed algorithm on point clouds of different sizes, within linear time complexity. (A) Object Segmentation Dataset [9]. (B) Kinect Segmentation Dataset [7]. (C) Cornell-RGBD-Dataset [1][5]. (D) TUM RGBD Dataset [3][6][13]. Axes: runtime (sec) and cloud size (×10^6 points).

Figure 10. Example result of prebuilt 3D model from Cornell-RGBD-Dataset [1][5]. (A) Original graph. (B) Labeled point cloud derived by our surface detection. (C) Demonstration of AR.

6. REFERENCES

[1] A. Anand, H. S. Koppula, T. Joachims and A. Saxena. Contextually guided semantic labeling and search for three-dimensional point clouds. The International Journal of Robotics Research, 2012.

[2] C. Feng, Y. Taguchi and V. R. Kamat. Fast plane extraction in organized point clouds using agglomerative hierarchical clustering. In Robotics and Automation (ICRA), 2014 IEEE International Conference (pp. 6218-6225). May, 2014.

[3] A. Handa, T. Whelan, J. McDonald and A. J. Davison. A benchmark for RGB-D visual odometry, 3D reconstruction and SLAM. In Robotics and Automation (ICRA), 2014 IEEE International Conference (pp. 1524-1531). May, 2014.

[4] D. Holz and S. Behnke. Fast range image segmentation and smoothing using approximate surface reconstruction and region growing. In Intelligent Autonomous Systems 12 (pp. 61-73). 2013.

[5] H. S. Koppula, A. Anand, T. Joachims and A. Saxena. Semantic labeling of 3d point clouds for indoor scenes. In Advances in Neural Information Processing Systems (pp. 244-252). 2011.

[6] R. A. Newcombe, S. J. Lovegrove and A. J. Davison. DTAM: Dense tracking and mapping in real-time. In Computer Vision (ICCV), 2011 IEEE International Conference (pp. 2320-2327). November, 2011.

[7] B. Oehler, J. Stueckler, J. Welle, D. Schulz and S. Behnke. Efficient multi-resolution plane segmentation of 3d point clouds. In Intelligent Robotics and Applications (pp. 145-156). 2011.

[8] J. Papon, A. Abramov, M. Schoeler and F. Worgotter. Voxel cloud connectivity segmentation - supervoxels for point clouds. In IEEE Conference on Computer Vision and Pattern Recognition (pp. 2027-2034). 2013.

[9] A. Richtsfeld, T. Mörwald, J. Prankl, M. Zillich and M. Vincze. Segmentation of unknown objects in indoor environments. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 4791-4796). October, 2012.

[10] R. B. Rusu and S. Cousins. 3d is here: Point cloud library (pcl). In 2011 IEEE International Conference on Robotics and Automation (ICRA) (pp. 1-4). May, 2011.

[11] R. Schnabel, R. Wahl and R. Klein. Efficient RANSAC for point cloud shape detection. In Computer Graphics Forum (Vol. 26, No. 2, pp. 214-226). Blackwell Publishing Ltd. June, 2007.

[12] S. Stein, M. Schoeler, J. Papon and F. Worgotter. Object partitioning using local convexity. In IEEE Conference on Computer Vision and Pattern Recognition (pp. 304-311). 2014.

[13] J. Stuehmer, S. Gumhold and D. Cremers. Real-time dense geometry from a handheld camera. In Pattern Recognition (pp. 11-20). September, 2010.

4. CONCLUSIONS We propose a model- and learning-free approach that detects multiple surfaces within linear time complexity. More specifically, we use a specially designed curved weighting formula to obtain a region-varying criterion, which enables the algorithm to grow through planar and curved surfaces. Moreover, by combining a bottom-up local growing method with top-down global recombination, the proposed method yields more robust results than the compared methods. As shown in Figure 7, all surfaces and sides of the objects in the scene are successfully detected by the proposed approach, while over-segmentation such as the split table in C.3 is prevented. Furthermore, our approach can also be applied to a prebuilt 3D world as in Figure 10, can differentiate surfaces with different curvature as in Figure 5(C), and does so as efficiently as shown in Table 1. With these features, we construct robust 3D spaces and provide reliable information for AR efficiently.

5. ACKNOWLEDGMENTS Special thanks to Chun-Han Yao and Po-Jen Lai for their discussions and kind assistance with this paper.

Figure 9. Example results using (A) Combinations of augmented points on original cloud. (B) Projection of RGB color of nearest augmented points to each original point.

