
Rough estimation of interior dimensions using structure from motion techniques

Sander Tiganik

Institute of Computer Science, University of Tartu

Supervisor: Artjom Lind

May 10, 2017

Abstract

This article briefly explains the techniques used to achieve the goals of a thesis of the same title. The thesis tries to create a new scene reconstruction algorithm for extracting a rough estimation of a room from a point cloud using no extra information. The goal is to achieve a reconstruction that is structurally complete (no holes) and that maps the floor plan of the room with only minor errors.

1 Introduction

"Structure from motion is a range imaging technique for estimating three-dimensional structures from two-dimensional image sequences" [1]. The common output of any structure from motion technique is a group of data points called a point cloud. A point cloud, as the name suggests, consists of points that have been calculated from the input images using epipolar geometry [5]. Each point usually encodes at least its position in 3-space (x,y,z) and a normal vector (which way the point is facing). Up until now, scene reconstruction algorithms have either used external information as a supplement to the reconstruction process or used more or less complex enveloping of the point cloud with polygons in order to achieve a reconstruction. In this article we show a pipeline that makes an assumption about the point cloud (it is a room) and, without any additional information, tries to reconstruct the room in a more natural way than a general enveloping algorithm might.

2 Idea

In order to perform scene reconstruction we need a set of steps that takes us from a point cloud to a polygon mesh. We outline this algorithm in six steps that will allow us to achieve our goals:

• Extract dominant axes

• Assign points to axes

• Reduce point cloud size

• Assign points to planes

• Reconstruct faces

• Ensure structural completeness

A point cloud contains a massive amount of data, of which only very specific points are useful once we assume that the outcome of the algorithm is a room. The first step is therefore to find the dominant axes of the scene. In a perfect world these would be X, Y and Z, but since the point cloud may have been computed with some rotation or translation (the room is tilted), the dominant axes have to be calculated.

Next we assign points to the axes we just calculated. If the normal vector of a point is no more than 5 degrees off an axis, we add the point to that axis. It is worth noting that we differentiate between the positive and negative heading of a vector, so points with normals (-1,0,0) and (1,0,0) are on two separate axes.

After the points are assigned to axes, we can reduce the amount of data in memory and cut down on computation time by removing all points that are not assigned to an axis.

Since we now know which way each point faces, we have effectively sorted all the points of all flat surfaces facing along the dominant axes. Next we need to distinguish between different surfaces (separate walls, objects) and group the points of each axis into planes (e.g. one single wall or one side of an object).

Once all the planes have been found, we can use any convex hull algorithm to reconstruct the face that the point group of each plane represents [3]. This gives us a polygon mesh of unconnected polygons.

In the final step a bounding box for the polygons is calculated. It marks the outer limits of the room. Once the outer limits are in place, we can start to extend the polygons calculated in the previous step. The idea is to make them rectangular, reaching from the floor to the ceiling, and then to extend them to the appropriate bounding box wall by following the opposite of the plane normal (extending the plane backwards until it hits the box wall).

These six steps should yield a satisfactory result for all of the goals that we set.
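The final step can be illustrated with a short sketch. The Python code below is only one possible reading of it, not the thesis implementation. It assumes that the scene has already been rotated so that the dominant axes coincide with X, Y and Z, that Z points up, and that every polygon is given as an array of 3D vertices. The function names bounding_box and extend_wall, as well as the example room, are hypothetical.

```python
import numpy as np

def bounding_box(polygons):
    """Axis-aligned bounding box (min corner, max corner) of all vertices."""
    vertices = np.vstack(polygons)
    return vertices.min(axis=0), vertices.max(axis=0)

def extend_wall(polygon, axis, sign, box_min, box_max, up=2):
    """Turn a wall polygon into a floor-to-ceiling rectangle and project it
    backwards (against its normal) onto the bounding box.

    polygon : (k, 3) vertices lying in a plane perpendicular to `axis`
    axis    : index (0 or 1) of the horizontal axis the wall normal follows
    sign    : +1 or -1, heading of the normal along that axis
    Floors and ceilings (axis == up) are not handled in this sketch.
    """
    side = 3 - axis - up                     # the remaining horizontal axis
    offset = polygon[:, axis].mean()         # position of the wall plane
    lo, hi = polygon[:, side].min(), polygon[:, side].max()

    # Rectangle in the wall plane, reaching from the floor to the ceiling.
    front = np.zeros((4, 3))
    front[:, axis] = offset
    front[:, side] = [lo, hi, hi, lo]
    front[:, up] = [box_min[up], box_min[up], box_max[up], box_max[up]]

    # The same rectangle pushed backwards until it hits the box wall.
    back = front.copy()
    back[:, axis] = box_min[axis] if sign > 0 else box_max[axis]
    return front, back

# Hypothetical room: a floor, a ceiling and one small wall patch facing +X.
floor = np.array([[0, 0, 0], [4, 0, 0], [4, 3, 0], [0, 3, 0]], dtype=float)
ceiling = floor + np.array([0.0, 0.0, 2.5])
wall = np.array([[1, 0, 0.5], [1, 2, 0.5], [1, 2, 1.5], [1, 0, 1.5]], dtype=float)

box_min, box_max = bounding_box([floor, ceiling, wall])
front, back = extend_wall(wall, 0, +1, box_min, box_max)
```

Here front is the wall patch stretched between the floor and the ceiling, and back is its projection onto the box wall behind it. Connecting the two rectangles with side faces would close the gap between the wall and the bounding box.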

3 Implementation

The implementation of this algorithm is not as straightforward as it may seem. Next we discuss some of the complexities that one may encounter when implementing it.

Already the first step is quite unintuitive. Finding the dominant axes of the scene is easier said than done. Extracting the axes with a search algorithm can be quite time consuming, while using the original X, Y and Z axes will most likely yield nothing useful. Fortunately, this problem has already been solved by the research team that created another reconstruction algorithm, called Manhattan-world stereo [2]. In that article they used a bin histogram to divide all the points according to their normal vectors and then took the three most common directions in which the points were facing. They found that in a scene depicting a structure (walls and right angles), the three most common directions were usually at right angles to each other (X, Y and Z for all intents and purposes).

Assigning the points to axes, by contrast, is a much simpler task. All one needs to do is calculate the angle between the point normal and the axis vector. If it is smaller than a certain threshold (we used 5 degrees), the point is accepted as belonging to that axis.
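The two steps described above can be sketched in Python with NumPy. This is a simplified stand-in and not the histogram of Manhattan-world stereo [2] or the thesis code: normals are binned on a coarse 3D grid, the most populated bins are taken greedily, and a bin is skipped if it lies too close to an axis that was already chosen (or to its opposite). The function names, the bin resolution and the 45 degree separation are assumptions made for illustration. Only the 5 degree threshold comes from the text.

```python
import numpy as np

def dominant_axes(normals, bins=10, min_separation_deg=45.0):
    """Pick the three most common normal directions (up to sign)."""
    unit = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    # Histogram: every normal falls into the grid cell of its rounded vector.
    cells, counts = np.unique(np.rint(unit * bins).astype(int),
                              axis=0, return_counts=True)
    max_cos = np.cos(np.radians(min_separation_deg))
    axes = []
    for idx in np.argsort(-counts):          # most populated cells first
        direction = cells[idx] / np.linalg.norm(cells[idx])
        # Skip cells that repeat (or mirror) an axis we already have.
        if all(abs(direction @ axis) < max_cos for axis in axes):
            axes.append(direction)
        if len(axes) == 3:
            break
    return np.array(axes)

def assign_to_axes(normals, axes, max_angle_deg=5.0):
    """Label each normal with a signed axis index, or -1 if none fits.

    Labels 0..2 are the positive headings, 3..5 the negative ones, so
    (1,0,0) and (-1,0,0) end up on separate axes.
    """
    unit = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    signed = np.vstack([axes, -axes])
    cosines = unit @ signed.T
    best = cosines.argmax(axis=1)
    close_enough = cosines.max(axis=1) >= np.cos(np.radians(max_angle_deg))
    return np.where(close_enough, best, -1)

# Hypothetical input: normals of an untilted room plus a few stray ones.
normals = np.array([[1, 0, 0]] * 50 + [[-1, 0, 0]] * 40 + [[0, 1, 0]] * 30
                   + [[0, 0, 1]] * 20 + [[1, 1, 1]] * 5, dtype=float)
axes = dominant_axes(normals)
labels = assign_to_axes(normals, axes)
kept = normals[labels >= 0]                  # points not on any axis are dropped
```

The last line also shows the reduction step: a single boolean mask over the labels removes every point that was not assigned to an axis.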

Figure 1: Point cloud after extracting dominant axes and assigning points (each color is a different axis).

Next came the reduction of point data in order to speed up later calculations. The complexity of this operation is O(n), since we only need to iterate over the input data once and remove all points that have not been assigned to an axis. In the test point cloud seen in Figure 1, the number of points decreased during this step from 1.4 million to a mere two hundred thousand.

Assigning points to planes is a difficult problem. It has a simple but costly solution and a complex but fast solution [2]. In this article we chose the costly solution, since it gives the added property of being able to distinguish between different planes on the same layer (a corridor with two walls on the same plane on either side). The algorithm itself is a variation of breadth-first search in which a point is added to a stack and, when it is processed, all of its neighbours are processed as well. In this case the "neighbours" of a point are all other points on the same axis that are closer than some distance. Unfortunately, this gives the algorithm a poor time complexity of O(n^2), since for each point we have to check all other points to see whether they are "neighbours". Regardless of the time complexity, this approach works very well, as seen in Figure 2.

Figure 2: Point cloud after reducing point cloud size and assigning points to planes.
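The neighbour-based grouping can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis code: the input is the array of points that belong to one signed axis, the distance threshold radius is a free parameter, and the function name group_into_planes is hypothetical. A queue is used for the worklist here, but a stack would find the same groups, only in a different visiting order.

```python
from collections import deque

import numpy as np

def group_into_planes(points, radius):
    """Group points so that each group is connected through neighbours.

    Two points are neighbours if they are closer than `radius`. Every point
    is compared against all others, hence the O(n^2) running time.
    Returns one integer group label per point.
    """
    n = len(points)
    labels = np.full(n, -1, dtype=int)       # -1 marks "not visited yet"
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = current
        worklist = deque([seed])
        while worklist:
            i = worklist.popleft()
            dist = np.linalg.norm(points - points[i], axis=1)
            neighbours = np.flatnonzero((dist < radius) & (labels == -1))
            labels[neighbours] = current     # mark before queueing
            worklist.extend(neighbours.tolist())
        current += 1
    return labels

# Hypothetical data: two wall segments on the same plane x = 0,
# separated by a gap (for example a corridor opening).
points = np.array([[0, 0.0, 0], [0, 0.1, 0], [0, 0.2, 0],
                   [0, 5.0, 0], [0, 5.1, 0]])
labels = group_into_planes(points, radius=0.15)
```

In this example the first three points end up in one group and the last two in another, even though all five lie on the same plane, which is the property that motivated the costly solution.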

Figure 3: Point cloud after application of gift wrapping on the planes.

For creating a polygon from a plane of points, any convex hull algorithm can be used [3]. For simplicity it may be useful to rotate and translate each plane to the origin of the scene and flatten it to a 2D object. This allows the use of 2D convex hull algorithms, which are simpler to implement. Afterwards the plane can be rotated and translated back to its original position. In this article the gift wrapping algorithm was chosen as the convex hull algorithm [4]. The results of this step can be seen in Figure 3.

Ensuring structural completeness works exactly as outlined in Section 2. To make the process easier, it is a good idea to rotate and translate the scene so that the dominant axes line up with X, Y and Z. This makes the calculation of the bounding box of the room less complicated. Once the bounding box has been calculated, we can extend the planes inside it until they touch the bounding box. Using this method ensures that there can be no structural errors in the output polygon mesh. The results of this step can be seen in Figure 4.

Figure 4: Point cloud after the completion of the reconstruction pipeline.
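Of the steps in this section, the face reconstruction lends itself well to a short example. The Python sketch below flattens the points of one plane to 2D, runs gift wrapping (Jarvis march) [4] on them and maps the hull back to 3D. It is an illustration under assumptions, not the thesis code: the function names are hypothetical, the plane normal is taken as given, and instead of an explicit rotation the points are projected onto two vectors that span the plane, which has the same effect.

```python
import numpy as np

def flatten_to_2d(points, normal):
    """Express the points of a plane in 2D coordinates within that plane."""
    normal = np.asarray(normal, dtype=float)
    normal = normal / np.linalg.norm(normal)
    # Any vector that is not parallel to the normal will do as a helper.
    helper = np.array([0.0, 1.0, 0.0]) if abs(normal[0]) > 0.9 \
        else np.array([1.0, 0.0, 0.0])
    u = np.cross(normal, helper)
    u /= np.linalg.norm(u)
    v = np.cross(normal, u)
    origin = points.mean(axis=0)
    rel = points - origin
    return np.column_stack((rel @ u, rel @ v)), (origin, u, v)

def gift_wrap(points_2d):
    """Convex hull by gift wrapping, returned in counter-clockwise order."""
    pts = np.unique(points_2d, axis=0)       # drops duplicates, sorts by x then y
    n = len(pts)
    if n < 3:
        return pts
    hull, p = [], 0                          # row 0 is the leftmost point
    while True:
        hull.append(p)
        q = (p + 1) % n
        for r in range(n):
            if r == p:
                continue
            a, b = pts[q] - pts[p], pts[r] - pts[p]
            turn = a[0] * b[1] - a[1] * b[0]
            # Prefer r if it lies to the right of p->q, or further out on it.
            if turn < 0 or (turn == 0 and b @ b > a @ a):
                q = r
        p = q
        if p == 0:                           # wrapped around to the start
            break
    return pts[hull]

def back_to_3d(points_2d, frame):
    """Undo flatten_to_2d for the given frame (origin, u, v)."""
    origin, u, v = frame
    return origin + np.outer(points_2d[:, 0], u) + np.outer(points_2d[:, 1], v)

# Hypothetical wall at x = 2: four corners and one interior point.
wall = np.array([[2, 0, 0], [2, 1, 0], [2, 1, 1], [2, 0, 1], [2, 0.5, 0.5]])
flat, frame = flatten_to_2d(wall, normal=[1, 0, 0])
hull_2d = gift_wrap(flat)
hull_3d = back_to_3d(hull_2d, frame)
```

For the example wall the hull consists of the four corners, and the interior point is discarded. With noisy real data the exact comparison turn == 0 would rarely trigger, so a tolerance may be needed there.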

4 Conclusion

In this report we set out to show that there exists a way to create a structurally complete polygon mesh using no other information than what is supplied in the point cloud. Using the six-step algorithm described above we have achieved just that. There is still room for improvement, but as it stands the algorithm should be able to generate an adequate polygon mesh from any point cloud of a room given as input. It is worth noting that a point cloud that is too sparse will still generate a mesh, but as the entropy of the room increases, so does the potential for errors. Nevertheless, the algorithm will never generate a fault in the structural integrity of the mesh. This leads us to conclude that we have achieved the goals that we set out to research.


References

[1] Structure from motion, address: https://en.wikipedia.org/wiki/Structure_from_motion

[2] Manhattan-world stereo, address: http://ieeexplore.ieee.org/document/5206867/

[3] Convex hull algorithms, address: https://en.wikipedia.org/wiki/Convex_hull_algorithms

[4] On the identification of the convex hull of a finite set of points in the plane, R. A. Jarvis (1973).

[5] Multiple View Geometry in Computer Vision, R. Hartley, A. Zisserman (2004).
