
State-of-the-art Algorithms for Complete 3D Model Reconstruction

Georgios Kordelas1, Juan Diego Perez-Moneo Agapito1,2, Jesus M. Vegas Hernandez2, and Petros Daras1

1 Informatics & Telematics Institute, 1st km Thermi Panorama Road, 57001, Thermi, Thessaloniki, Greece

2 Computer Science Department, University of Valladolid, Valladolid

Abstract. The task of generating fast and accurate 3D models of a scene has applications in various computer vision fields, including robotics, virtual and augmented reality, and entertainment. So far, the computer vision scientific community has provided innovative reconstruction algorithms that exploit various types of equipment. In this paper, a survey of recent methods that are able to generate complete and dense 3D models is given. The algorithms are classified into categories according to the equipment used to acquire the processed data.

Keywords: 3D reconstruction, multi-view, scan registration.

1 Introduction

Nowadays, many advanced applications require three-dimensional (3D) information. The third dimension plays a decisive role in the analysis of dynamic or static environments. Fields of application in daily life that may exploit the third dimension include the surveillance and robotic domains, which use depth information to gain a much better analysis of the environment. In medical research, new technologies require more and more reliable depth data. In general, the domains of 3D image processing, digital photography, games, multimedia, 3D visualization and augmented reality make increasing use of real-time 3D information.

The existing 3D modeling methods can be classified according to the required input data, while their efficacy is reflected by the variety of scenes that can be processed, the fidelity of the final model and the total processing time. According to the user requirements, automated, semi-automated or manual image-based approaches can be selected to produce digital models usable for inspection, visualization or documentation. Automated methods focus mainly on the full automation of the process, but generally produce results that are mainly suited to visually appealing real-time 3D recording or simple visualization. On the other hand, semi-automated methods try to reach a balance between accuracy and automation and are very useful for precise documentation and restoration planning.

2 Lecture Notes in Computer Science

The aim of this paper is to identify the most recent and advanced methods and present them according to the equipment they exploit. The rest of this paper is organized as follows. In Section 2, algorithms using range scanner technology are presented. Section 3 covers multi-view stereo approaches. In Section 4, an overview of the 3D reconstruction lab established within the scope of the 3DLife project [54] is presented, while conclusions are drawn in Section 5.

2 Reconstruction Using Laser Scanner Technology

In recent years, laser scanner technology has emerged as a useful and competitive approach for creating 3D reconstructions. The basic advantages of the methods that use this technology are: (i) speed, (ii) accuracy and (iii) resolution of the reconstruction. Moreover, the scanners’ field of view allows for the reconstruction of objects whose size ranges from a few centimeters to several meters and which lie at a short or long distance. Consequently, this technology is suitable for large-size scenes, such as the interior and exterior of buildings, and it is therefore generally accepted by the community as a valid support for the documentation and conservation of historic buildings, monuments or archaeological sites. In brief, the reconstruction of a scene using a range scanner requires the following steps:

1. Acquisition of an appropriate number of colour range scans so as to adequately cover the 3D scene.

2. Registration of the range scans in the same coordinate system (Fig. 1(I)).

3. Data processing for refinement of the final 3D surface model (Fig. 1(II)). The processing stage includes the following steps:

– Elimination of redundant information (i.e. duplicate data created by the registration of overlapping regions) from the registered point cloud, and noise removal.

– Construction of a 3D model that comprises polygonal facets from the point cloud, and completion of its missing data (hole filling).

The placement of the equipment in order to acquire the range scans is a trivial task, while there are many commercial software packages that are able to process data so as to refine the final 3D model [52, 53]. Therefore, the main problem during the 3D reconstruction procedure lies in the automatic computation of the three-dimensional transformations that align the range data sets in order to extract the complete 3D model. Registering point clouds in the same coordinate system presumes the use of a 3D-to-3D registration method. There are plenty of methods dealing with range data registration [1–9, 13–15, 18–23]. Some methods are automatic and rely on an automated feature matching procedure [4–9, 13–15, 18, 23], while others require the use of markers [1–3]. This review presents the most efficient procedures that can perform accurate registration without the need for special markers. The range scan registration procedure can be divided into two steps: (1) initial registration, which provides a good initial guess of the alignment transformation, and (2) fine registration, which gives the accurate alignment transformation.


Fig. 1. Registration of: (I) two range scans (II) multiple range scans forming a 3D model [4].

2.1 Initial Registration

A wide variety of methods have already been proposed for the initial registration of scans. Several methods extract geometric features from the scans to allow feature matching and alignment between scans. In [4], an effective system that integrates automated 3D-to-3D registration based on geometric features is presented. During preprocessing, a set of major 3D planes, a set of geometric 3D lines and a set of reflectance 3D lines are extracted from each 3D range scan. The range scans are registered in the same coordinate system via the automated 3D-to-3D feature-based range-scan registration method of [5]. As a result, all range scans are registered with respect to one selected pivot scan. Since there are some cases in which the extracted linear features are inadequate to register the range scans together, the authors use nonlinear features such as circular arcs. Then, a circle-based registration method based on the similarity of radii, orientation and relative position between pairs of circles is utilized in the matching phase. From each valid pair of matching circles, a candidate transformation is computed and its correctness is evaluated. Finally, the transformation achieving the smallest average distance between overlapping range points is chosen as the best one. The mean registration error for this method is about 1 cm, and it can efficiently register scans with a minimum overlap of about 20%. The work presented in [6] proposes an angular-invariant feature for the 3D registration procedure to perform reliable selection of point correspondences. The angular feature, which is invariant to scale and rotation transformations, improves the convergence and error without any assumptions about the initial transformation. A major criticism against this feature, however, is that it can discover the potential structural information hiding in nearly flat surfaces. The work presented in [4] can efficiently perform automatic range scan registration using simple geometric features (3D lines, circles). Therefore, this algorithm is appropriate when intensity data does not accompany the range data.
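The selection rule described for [4] (keep the candidate transformation with the smallest average distance between overlapping range points) can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of [4]: the function names, the brute-force nearest-neighbour search and the `max_dist` overlap threshold are all hypothetical choices.

```python
import numpy as np

def mean_overlap_distance(source, target, R, t, max_dist):
    """Mean closest-point distance over points that overlap after alignment."""
    moved = source @ R.T + t
    # Brute-force nearest-neighbour distances (adequate for a small example).
    d = np.linalg.norm(moved[:, None, :] - target[None, :, :], axis=2).min(axis=1)
    overlap = d < max_dist  # assumed definition of "overlapping" points
    return d[overlap].mean() if overlap.any() else np.inf

def pick_best_transform(source, target, candidates, max_dist=0.5):
    """Return the index and score of the candidate (R, t) with the lowest score."""
    scores = [mean_overlap_distance(source, target, R, t, max_dist)
              for R, t in candidates]
    best = int(np.argmin(scores))
    return best, scores[best]

# Synthetic check: the true transform should beat the identity.
rng = np.random.default_rng(0)
source = rng.uniform(-1.0, 1.0, size=(200, 3))
a = np.deg2rad(30.0)
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([0.3, -0.2, 0.1])
target = source @ R_true.T + t_true
candidates = [(np.eye(3), np.zeros(3)), (R_true, t_true)]
best_index, best_score = pick_best_transform(source, target, candidates)
```

A score based only on the overlapping points can favour a wrong candidate that overlaps in very few points, so a practical system would probably also require a minimum overlap size.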

A class of registration methods extracts descriptors of large scan areas as the basis for registration and 3D object recognition. An early method for free-form surface registration is proposed in [13]. This approach uses the spin image surface representation, which has low discriminating capability because it maps the 3D range image into a 2D histogram. Therefore, the spin image matching procedure results in many ambiguous correspondences, which must be processed through a number of filtration stages to prune out incorrect ones, making the technique computationally inefficient even for range images of a reasonable size. A novel 3D free-form surface area representation scheme based on third-order tensors is used for surface registration in [14]. More specifically, multiple tensors are used to represent each range scan. Tensors of two range scans are matched to identify correspondences between them. Correspondences are verified and then used for pairwise registration of the range images. The experimental results show that this algorithm is robust to resolution, number of tensors per view, the required amount of overlap and noise. Comparison against [13] demonstrated its registration efficiency. In [15], the scan alignment is based on the correlation of two Extended Gaussian Images (EGIs) [16] in the Fourier domain using the spherical harmonics of the EGI and the rotational Fourier transform [17]. For pairs with low overlap, which fail to satisfy two criteria (the first one is based on the consistency of surface orientations in the overlapping region and the second one on visibility information), the rotational alignment can be obtained by the alignment of constellation images generated from the EGIs. Rotationally aligned sets are matched by correlation using the Fourier transform of volumetric functions. The merit of this method is that it can efficiently align point clouds with arbitrarily large displacements that have very little overlap. The major advantage of the methods in [13–15] is that they can register free-form objects, unlike [4–6], which use simple geometric features.
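To make the "3D to 2D histogram" mapping of the spin image concrete, the following minimal sketch accumulates one spin image for an oriented point. It assumes simple nearest-bin accumulation; the function name and the parameters `bin_size` and `image_width` are illustrative and are not taken from [13].

```python
import numpy as np

def spin_image(points, p, n, bin_size, image_width):
    """2D histogram of `points` around the oriented point (p, n).

    alpha: radial distance from the line through p along the normal n
    beta:  signed height above the tangent plane at p
    """
    n = n / np.linalg.norm(n)
    d = points - p
    beta = d @ n
    alpha = np.sqrt(np.maximum(np.sum(d * d, axis=1) - beta ** 2, 0.0))
    # Rows index beta (centred on the middle of the image), columns index alpha.
    i = np.floor((image_width * bin_size / 2.0 - beta) / bin_size).astype(int)
    j = np.floor(alpha / bin_size).astype(int)
    keep = (i >= 0) & (i < image_width) & (j >= 0) & (j < image_width)
    image = np.zeros((image_width, image_width))
    np.add.at(image, (i[keep], j[keep]), 1.0)  # handles repeated bins correctly
    return image
```

Because only the pair (alpha, beta) is kept, all points on a circle around the normal fall into the same bin. This is what makes the descriptor invariant to rotation about the normal, and it is also the loss of information that the text identifies as the source of ambiguous matches.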

When calibrated intensity images accompany the range scans, intensity features can be combined with range information to develop efficient registration methods. In the method developed by Wyngaerd and Van Gool [18], the 3D measurements are combined with the texture. In particular, the surface is intersected with small spheres centered at feature points. The intersection line between such a sampling sphere and the surface defines an invariant region in the texture image. The surface texture inside these regions is used for matching regions between different patches. The merit of this approach is that it can be applied to parts of the surface with poor geometry and rich texture or with rich geometry and poor texture. Bendels et al. [9] match 2D SIFT features [11], backproject them onto range data, and then employ RANSAC [10] on these points in 3D to identify an initial 3D transformation. A similar registration system is presented in [7]. During initialization, intensity keypoints and their SIFT descriptors are extracted from the images and backprojected onto the range data. A 3D coordinate system is established for each keypoint using its backprojected intensity gradient direction and its locally computed range surface normal. Keypoints are then matched using their SIFT descriptors, and each match provides an initial rigid transformation estimate. An extension of the Dual-Bootstrap ICP algorithm [12] is used for the fine alignment of the scans. The weakest point of [7] is initialization, since an improper initial estimate would cause alignment failure. Seo et al. [8] use a method to correct geometric and illumination variations of the photometric features before performing 2D local feature matching using the SIFT algorithm. The authors claim that this algorithm is faster than most methods that rely on shape information. Methods that exploit intensity information [7–9, 18] perform well in aligning textured range scans. However, a lack of texture would make these approaches useless.
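The idea that a single keypoint match can yield a rigid transformation, once each keypoint carries its own 3D coordinate system, can be sketched as below. The construction of the frame (normal as one axis, the gradient direction projected onto the tangent plane as another) is an assumption made for illustration and is not claimed to be the exact construction of [7]; the function names are hypothetical.

```python
import numpy as np

def local_frame(normal, gradient_dir):
    """Right-handed orthonormal frame built from a normal and a gradient direction."""
    z = normal / np.linalg.norm(normal)
    # Project the gradient direction onto the tangent plane of the surface.
    x = gradient_dir - (gradient_dir @ z) * z
    x = x / np.linalg.norm(x)
    y = np.cross(z, x)
    return np.column_stack([x, y, z])  # columns are the frame axes

def transform_from_match(p_a, F_a, p_b, F_b):
    """Rigid (R, t) mapping keypoint a (position p_a, frame F_a) onto keypoint b."""
    R = F_b @ F_a.T       # rotates the axes of frame a onto those of frame b
    t = p_b - R @ p_a     # then moves keypoint a onto keypoint b
    return R, t
```

With noisy normals or gradient directions, the estimate from one match is only approximate, which is consistent with the text's remark that a fine alignment stage follows and that a poor initial estimate can cause failure.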

2.2 Fine Registration

Initial registration provides a good estimate of the alignment transformation between sets of 3D range scans, while fine registration, which follows initial registration, aims to optimally align these sets. In the literature, the ICP (Iterative Closest Point) algorithm [19] is a very popular method for the fine registration of 3D data sets when an initial guess of the relative pose between them is known. The work presented in [20] classifies several ICP variants and evaluates their performance according to the time required to reach the correct alignment. Moreover, a combination of ICP variants optimized for high speed is proposed. The study on the convergence properties of ICP [21] shows that the use of normal distances is more effective than Euclidean distances and proposes faster convergence with higher-order approximations. An alternative approach that combines intensity and geometric attributes to filter closest point matches is studied in [22].
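A minimal point-to-point ICP loop is sketched below to illustrate the alternation between closest-point matching and closed-form rigid alignment. It is a simplified sketch, not any of the variants evaluated in [20]: the brute-force matching, the SVD-based alignment step and the stopping rule are assumptions chosen for brevity.

```python
import numpy as np

def best_rigid_transform(A, B):
    """Least-squares (R, t) such that R @ a + t is close to b for paired rows."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection in the SVD solution.
    s = 1.0 if np.linalg.det(Vt.T @ U.T) > 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, s]) @ U.T
    return R, cb - R @ ca

def icp(source, target, iterations=30, tol=1e-10):
    """Align `source` to `target`; returns the accumulated (R, t) and final error."""
    R_total, t_total = np.eye(3), np.zeros(3)
    moved = source.copy()
    prev_err, err = np.inf, np.inf
    for _ in range(iterations):
        # Closest target point for every source point (brute force).
        d = np.linalg.norm(moved[:, None, :] - target[None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        err = d[np.arange(len(moved)), nearest].mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
        R, t = best_rigid_transform(moved, target[nearest])
        moved = moved @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total, err
```

The loop only converges to the correct alignment when the starting pose is already close, which is why the initial registration methods of Section 2.1 are needed first.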

A maximum-likelihood method for the registration of scenes with unmatched or missing data, which does not require ICP refinement, is presented in [23]. In this method, correspondences are formed between valid and missing points in each view. The matched points are classified into types according to their visibility properties. Then a generic sensor model, which can be adjusted to match a wide variety of sensors, is used to generate likelihood measures for each point type. Finally, a multistage optimization procedure based on these likelihood measures takes place to find a maximum-likelihood registration. The experimental results demonstrated the efficacy of this method in registering complex noisy scenes with occlusions, missing data, and varying degrees of overlap.

Even if range scanners give promising results, their cost, size, power requirements and the intricate handling of their data are significant drawbacks. Therefore, the availability of methods that use range scanner technology is limited when compared to multi-view stereo methods.

3 Multi-view Stereo Reconstruction

In this section, a survey of multi-view stereo reconstruction methods that can provide dense and full 3D reconstructions of objects from multiple views is given. Thus, binocular, trinocular, and multi-baseline methods that reconstruct a single dense map or structure-from-motion are not considered in this survey.

Fig. 2. (I) Intersection of three visual cones [24] and (II) Evolving shape after eroding inconsistent points [29].

There are several types of methods that are used to reconstruct 3D models of objects from a set of images. These methods can be classified into: (1) methods that reconstruct the visual hull of the object, (2) approaches that recover the photo-hull of an object and (3) algorithms that minimize the surface integral of a certain cost function over the surface shape.

3.1 Visual Hull Reconstruction

The first class includes methods that exploit silhouette information to generate intersected visual cones (Fig. 2(I)), which are then used to obtain the 3D representation of an object. Silhouette-based methods are popular in multi-camera environments mainly due to their simplicity and computational efficiency. In [49], a parallel pipeline processing method for reconstructing a dynamic 3D object shape from multi-view video images is proposed. Real-time processing is accomplished through the combination of a plane-based volume intersection algorithm with a parallel pipeline implementation. The quantitative performance evaluations demonstrated that the acceleration and parallelization algorithms are very efficient in reconstructing a dynamic full 3D shape at good resolution. A novel framework for multi-view silhouette cue fusion is proposed in [25]. This framework uses a space occupancy grid as a probabilistic 3D representation of scene contents. The idea behind this paper is to consider each camera pixel as a statistical occupancy sensor. All pixel observations are then used jointly to infer where, and how likely, matter is present in the scene. With this approach, optimal scene object localization and robust volume reconstruction can be achieved, with no constraint on camera placement and object visibility. An algorithm that eliminates the problems related to dense feature point matching and camera calibration is presented in [28]. This method is based on the projective geometry between the object space and silhouette images taken from multiple viewpoints. The object shape is reconstructed by establishing a set of hypothetical planes slicing the object volume and estimating the projective geometric relations between the images. Ishikawa et al. [26] include a genuine segmentation method to acquire the silhouettes, but segmentation errors directly affect the visual hull, since the segmentation process is completely independent of the reconstruction process. Grauman et al. [27] proposed a Bayesian approach to compensate for modeling errors from false segmentation. They modeled the prior density using probabilistic principal component analysis and estimated a maximum-a-posteriori reconstruction of multi-view contours. This approach reconstructs good error-compensated models from erroneous silhouette information, but it needs prior knowledge about the objects and ground-truth training data. In conclusion, shape-from-silhouette approaches can generate full 3D reconstructions of dynamic scenes, but they lack reconstruction fidelity and are very sensitive to errors in silhouette extraction. Therefore, more efficient techniques, in terms of reconstruction quality, are employed to generate 3D models with an increased level of detail.
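The basic silhouette intersection idea can be illustrated with a voxel-based sketch: a voxel is kept only if it projects inside the silhouette in every view. This is a generic illustration that assumes calibrated cameras given as 3x4 projection matrices and binary silhouette masks; it is not the plane-based algorithm of [49] nor the probabilistic grid of [25], and all names are hypothetical.

```python
import numpy as np

def visual_hull(voxels, cameras, silhouettes):
    """Boolean mask of the voxels that fall inside every silhouette.

    voxels:      (N, 3) voxel centres in world coordinates
    cameras:     list of 3x4 projection matrices P = K [R | t]
    silhouettes: list of 2D boolean masks indexed as mask[row, column]
    """
    inside = np.ones(len(voxels), dtype=bool)
    homog = np.hstack([voxels, np.ones((len(voxels), 1))])
    for P, sil in zip(cameras, silhouettes):
        uvw = homog @ P.T
        in_front = uvw[:, 2] > 0
        depth = np.where(in_front, uvw[:, 2], 1.0)  # avoid division by zero
        u = np.round(uvw[:, 0] / depth).astype(int)
        v = np.round(uvw[:, 1] / depth).astype(int)
        h, w = sil.shape
        visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(voxels), dtype=bool)
        hit[visible] = sil[v[visible], u[visible]]
        inside &= hit  # carve voxels that miss this silhouette
    return inside
```

In this sketch a voxel that projects outside an image is treated as carved, which presumes that every camera sees the whole object. A single wrong silhouette pixel removes or keeps a voxel outright, which illustrates the sensitivity to silhouette errors noted in the text.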

3.2 Space Carving Reconstruction

The second class includes the space carving approaches, which take into account the photometric consistency of the surface across the input images and allow for the recovery of the photo-hull that contains all possible photo-consistent reconstructions. Space carving methods generate an initial reconstruction that envelops the object to be reconstructed. The surface of the reconstruction is then eroded at the points that are inconsistent with the input images (Fig. 2(II)). By repeating this process, a reconstruction that is consistent with the input images emerges. The Space Carving algorithm suggested by Kutulakos and Seitz [29] repeatedly sweeps a plane through the scene volume and tests the photo-consistency of voxels on that plane. This approach permits arbitrary camera placement. However, it has the drawback of making hard and irreversible commitments on the removal of voxels. In particular, if a voxel is removed in error, further voxels can be erroneously removed in a cascade effect. This may lead to an incorrect reconstruction by creating holes. Thus, reformulations of space carving have been proposed in the literature. A probabilistic space carving framework for analyzing the 3D occupancy computation problem from multiple images is introduced in [30]. This framework enables a complete analysis of the complex probabilistic dependencies inherent in occupancy calculations and provides an expression for the tightest occupancy bound recoverable from noisy images. In [31], two major extensions to the Space Carving framework are presented. The first one is a progressive scheme for better reconstruction of surfaces lacking sufficient texture. The second one is a novel photo-consistency measure that is valid for both specular and diffuse (Lambertian) surfaces, without the need for light calibration. This method, unlike [29, 30], can deal with surfaces lacking sufficient texture. In conclusion, the Space Carving framework suffers from several important limitations:


– The original Space Carving approach [29] makes hard decisions. This limitation is partially overcome in [30].

– The choice of the global threshold on the color variance is often problematic [29, 30]. An attempt to alleviate these photometric constraints is presented in [31].

– The voxel-based representation used in the Space Carving approaches disregards the continuity of shape, which makes it very hard to enforce any kind of spatial coherence. As a result, space carving is sensitive to noise and outliers and may yield noisy reconstructions.
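The hard, threshold-based decision criticized in the list above can be made concrete with a small sketch of a color-variance photo-consistency test. The exact statistic (sum of per-channel variances) and the name `is_photo_consistent` are assumptions for illustration; the cited papers may define the test differently.

```python
import numpy as np

def is_photo_consistent(colors, threshold):
    """Decide whether one voxel is photo-consistent.

    colors:    (k, 3) RGB samples of the voxel from the k cameras that see it
    threshold: global bound on the summed per-channel color variance
    """
    colors = np.asarray(colors, dtype=float)
    if len(colors) < 2:
        return True  # a single observation cannot be contradicted
    variance = colors.var(axis=0).sum()
    return bool(variance <= threshold)

# Nearly identical samples pass, strongly differing samples fail.
consistent = is_photo_consistent([[200, 10, 10], [198, 12, 9], [201, 11, 10]], 100.0)
inconsistent = is_photo_consistent([[200, 10, 10], [10, 200, 10]], 100.0)
```

A voxel just above the threshold is removed permanently and one just below is kept, so the result depends strongly on a single global value. This is the behaviour that the probabilistic formulation of [30] and the alternative measure of [31] try to soften.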

3.3 Reconstruction via Surface Integral Minimization

The third class of methods optimizes the surface integral of a consistency function over the surface shape. Level-set based methods provide a way of minimizing this cost function. In [43], surface reconstruction is achieved by combining both 3D data and 2D image information. This leads to a more robust approach than existing methods that use only pure 2D information or 3D stereo data. For the efficient evolution of surfaces, a bounded regularization method based on level-set methods is proposed. Additionally, if the silhouette information from the 2D images is available, it can be integrated to improve the pure 3D results. The main limitation of this system is due to the choice of the surface evolution approach, which assumes a closed and smooth surface. Therefore, the surface reconstruction module is not designed for outdoor or polyhedral objects. The algorithm presented in [37] starts with a generic surface, say a large sphere or a smooth cube, and evolves it to best approximate the shape of the scene. This task is performed by numerically integrating systems of partial differential equations using the level set method presented in [41]. In order to deal with non-Lambertian surfaces, this method uses a model of the radiance that accounts for deviations from Lambertian reflection through an affine subspace constraint on the radiance tensor field. This algorithm does not require strong texture and can handle sharp radiance changes. A novel method for multi-view stereovision that minimizes the prediction error using a global image-based matching score is presented in [40]. The input image views are warped and registered with a user-defined image similarity measure, which can include neighborhood and global intensity information. The surface evolution is implemented in a level set framework [41]. Experiments demonstrated the superiority of this method over the method presented in [37], even for complex non-Lambertian images including specularities and translucency.
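As a hedged illustration of the quantity that this class of methods minimizes, the cost can be written as a weighted surface area. The symbols $S$, $\rho$ and $\phi$ are notation assumed here for exposition and are not taken from the cited papers:

```latex
E(S) = \int_{S} \rho(\mathbf{x}) \, dA ,
\qquad
S = \{ \mathbf{x} \in \mathbb{R}^{3} : \phi(\mathbf{x}) = 0 \}
```

Here $\rho(\mathbf{x}) \ge 0$ denotes a consistency cost that is small where the input images agree on the appearance of the point $\mathbf{x}$, and $dA$ is the surface area element. In a level-set formulation the surface $S$ is represented implicitly as the zero level set of a function $\phi$, and $\phi$ is evolved so that $E(S)$ decreases.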

Besides level-set based methods, a second way of minimizing the surface integral is to use graph cuts. Yu et al. [38] propose a new iterative graph-cut based algorithm, which operates on the Surface Distance Grid, to reduce the minimal surface bias and transform the discretization bias into a controllable degree of surface smoothness (these biases make the recovery of surface extrusions and other details difficult). This algorithm works better than [37] in preserving edges and corners, which results in a lower volume difference. The drawback of this method is the assumption that the initial estimate is already quite close to the final result. In [39], a direct surface reconstruction approach is proposed, which starts from a continuous geometric functional that is minimized up to a discretization by a global graph-cut algorithm operating on a 3D embedded graph. The whole procedure is consistently incorporated into a voxel representation that handles both occlusions and discontinuities. In [32], an algorithm for reducing the minimal surface bias associated with volumetric graph cuts for 3D reconstruction from multiple calibrated images is presented. The algorithm is based on an iterative graph cut over narrow bands combined with an accurate surface normal estimation. At each iteration, the normal to each surface patch is optimized in order to obtain a precise value for the photometric consistency measure. Then, a volumetric graph cut is applied on a narrow band around the current surface estimate to determine the optimal surface inside this band. The reconstruction results, obtained on standard data sets, are more accurate and complete than in [45] and [33] ([47] presents the evolution of this work and is described below). Additionally, this method does not require exact silhouette images (unlike [45]) or the use of a ballooning term (unlike [33]). The octahedral graph structure used in [34] establishes a well-defined relationship between the photo-consistency of a voxel and the edge weights of an embedded octahedral subgraph. This specific graph design supports a hierarchical surface extraction, which makes it possible to efficiently process even high volumetric resolutions and a large number of input images. This method achieves high resolution in the region of interest, but it relies heavily on the visual hull being a good approximation of the surface.

3.4 Fusion of Reconstruction Techniques

Many recent approaches use a fusion of different reconstruction techniques to accomplish better reconstruction results. The flexibility of the carving approach is combined with the accuracy of graph-cut optimization in [35]. In this algorithm, a progressive refinement scheme is used to recover the topology and reason about the visibility of the object. Within each voxel, a detailed surface patch is optimally reconstructed using a graph-cut method. The advantage of this technique is its ability to handle complex shapes similarly to level sets while enjoying a higher precision. Compared to carving techniques, the produced surface does not suffer from aliasing. This work is extended in [36], where a new surface representation method, called patchwork, is introduced. A patchwork is the combination of several patches that are built one by one. This design potentially allows for the reconstruction of an object with arbitrarily large dimensions while preserving a fine level of detail. This algorithm outperforms the level-set method presented in [42] and Space Carving [29]. The use of graph-cut optimization for the volumetric multi-view stereo problem is also introduced in [47]. Initially, an occlusion-robust photo-consistency metric is defined, which is then approximated by a discrete flow graph. This metric uses a robust voting scheme that treats pixels from occluded cameras as outliers. Graph-cut optimization can exactly compute the minimal surface that encloses the largest possible volume, where surface area is just a surface integral in this photo-consistency field. However, the ballooning term used in this method cannot handle thin structures or big concavities. Kolev et al. [44] consider three different energy models for multi-view reconstruction, which are based on a common variational template unifying regional volumetric terms and on-surface photoconsistency. While the first two approaches are based on a classical silhouette-based volume subdivision, the third one relies on stereo information to introduce the concept of propagated photoconsistency, thereby addressing some of the shortcomings of classical methodologies. Qualitative and quantitative experiments showed that precise and spatially consistent reconstructions can be computed by minimizing continuous convex functionals. In [45], a graph-cut algorithm manages to recover the 3D shape of an object using both silhouette and foreground color information. Initially, a method that is able to deal with silhouette uncertainties arising from background subtraction is used to extract the visual hull of the object. Then, the graph-cut algorithm is used for optimization on a color consistency field. The constraints that are added to improve its performance are efficient enough to preserve protrusions and to pursue concavities on the surface of the object. Sinha et al. [46] recover surfaces at high resolution by performing a graph cut on the dual of an adaptive volumetric mesh created by photo-consistency driven subdivision. This method does not require good initializations and is not restricted to a specific surface topology. The specific graph-cut formulation enforces silhouette constraints to counter the bias for minimal surfaces. Local shape refinement via surface deformation is used to recover details in the reconstructed surface. However, the solutions that incorporate silhouette constraints [45, 46] are only viable when exact silhouettes are available.

Fig. 3. Provided interface for (I) displaying the frames captured by the cameras in real time and (II) calibrating the camera network.

4 The 3D Reconstruction Lab of 3DLife

The 3DLife [54] project aims to develop technologies that could make interaction between humans in virtual online environments easier, more reliable and more realistic. In order to achieve this goal, the integration of recent progress in 3D data acquisition and processing, autonomous avatars, real-time rendering, interaction in virtual worlds and networking is required. Therefore, scenarios that will enable the exploitation of recent progress have been developed. A key element in these scenarios is the collection of 3D data in real time from moving people. For this purpose, a 3D reconstruction lab was established.

The lab consists of six CCD cameras (Fig. 4(IV)), which are mounted on a cylindrical grid. Each camera is connected to a computer, and the computers are connected to a network switch, forming a star network. There is an additional computer used as a server, on which the Network Time Protocol is installed. This protocol is used to synchronize the clocks of the computers with the server, so that all the computers share the same time.
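The paper does not detail how the clocks are aligned. As background, the standard NTP calculation estimates the clock offset and the round-trip delay from four timestamps of one request/response exchange. The sketch below shows only that arithmetic; the function name is hypothetical and no claim is made about the lab's actual configuration.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Clock offset and round-trip delay from one NTP exchange.

    t0: request sent      (client clock)
    t1: request received  (server clock)
    t2: response sent     (server clock)
    t3: response received (client clock)
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2.0  # server time minus client time
    delay = (t3 - t0) - (t2 - t1)           # time spent on the network
    return offset, delay

# Example: client clock 5 s behind, 0.1 s each way, 0.02 s server processing.
offset, delay = ntp_offset_and_delay(100.0, 105.1, 105.12, 100.22)
```

The offset estimate is exact only when the network delay is the same in both directions, which is a reasonable assumption for machines attached to the same switch.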

The server’s Graphical User Interface (GUI) encompasses the tools used for monitoring and performing the 3D reconstruction process. In particular, video frames from all cameras are displayed in real time through the interface (Fig. 3(I)). Thus, the cameras’ positions and orientations can be readily adjusted to gain better visibility of the scene. The calibration of the camera network is performed via Zhang’s algorithm [48], which uses a chessboard pattern as the calibration object. The GUI (Fig. 3(II)) allows the user to define the calibration pattern settings (the number and the size of the squares along the width and height), while calibration can be performed either automatically or manually. The calibration outcome consists of the extrinsic and intrinsic parameters, which are essential to fuse video data across the cameras. The volumetric intersection approach presented in [49] is the basis for computing the visual hull of the 3D object from multiple video inputs. The process of generating a 3D frame in this framework is described by the following steps:

1. Synchronized multiple video capturing: A set of multiple images is captured simultaneously by the camera network (Fig. 4(I)).

2. Silhouette Extraction: A silhouette is extracted per frame by exploiting a background subtraction algorithm [51] (Fig. 4(II)).

3. Silhouette Volume Intersection: A visual cone, encasing the 3D object, is generated per silhouette and the cones are intersected with each other to generate the visual hull of the object (voxel representation).

4. Surface Shape Representation: A marching cubes method [50] is applied to convert the voxel representation to a polygonal representation (the outcome of steps 3 and 4 is visualized in Fig. 4(III)).

5. Texture Mapping: Color and texture are projected on the generated 3D shape.
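Step 2 of the list above can be illustrated with the simplest possible form of background subtraction: thresholding the per-pixel difference to a static background image. This is a deliberately naive sketch and not the algorithm of [51]; the function name and the default threshold are assumptions.

```python
import numpy as np

def extract_silhouette(frame, background, threshold=30.0):
    """Binary silhouette from an RGB frame and a static background image.

    Both inputs are (H, W, 3) arrays. They are converted to float first so
    that the subtraction cannot wrap around for unsigned 8-bit images.
    """
    diff = np.abs(frame.astype(float) - background.astype(float))
    # A pixel is foreground if any color channel differs enough.
    return diff.max(axis=2) > threshold
```

A fixed threshold of this kind is sensitive to sensor noise, shadows and lighting changes, which are the weaknesses that the planned improvements to the background subtraction stage aim to address.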

The reconstruction results for a time frame are depicted in Fig. 5. More specifically, in the left image the 3D object is depicted in a model simulation of the reconstruction lab (the spheres simulate the positions of the cameras), while in the right image a closer view of the reconstructed object is provided. The main advantages of this method are the following:

– It allows real-time dynamic full 3D shape reconstruction.

– It provides good 3D reconstruction results even for poor quality video, as no texture correspondences are needed.

Fig. 4. (I) Video Capture, (II) Silhouette Extraction, (III) 3D Shape Reconstruction and (IV) 3D Reconstruction Lab Overview.

Thus, this algorithm fits the scope of this lab, which is to collect 3D data in real time from moving people. Its main drawback lies in the fact that a high level of detail cannot be obtained. Other 3D reconstruction methods (Sections 3.2 and 3.3) could provide better reconstruction fidelity, but the time required for data processing is far from real time. As a consequence, they are not appropriate for the specific application.

Ongoing research towards improving the 3D reconstruction algorithm includes: (a) the employment of a background subtraction algorithm more efficient than [51], which will be more robust to camera sensor noise, ambiguities between object and background colors, and changes in the lighting of the scene (including shadows of objects of interest), and (b) the fusion of the existing method with novel approaches that will improve the reconstruction fidelity without adding significantly to the processing time.
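To make the robustness issue in (a) concrete, the sketch below uses a deliberately simple background model, a per-pixel mean with a fixed threshold, rather than the codebook model of [51]. The function names, the 4x4 frames, the intensity values and the threshold are assumptions chosen for illustration only. Small sensor noise is absorbed by the threshold, whereas a global lighting change turns every pixel into foreground.

```python
def build_background(frames):
    """Per-pixel mean intensity over frames that show only the background."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(width)]
            for r in range(height)]

def extract_silhouette(frame, background, threshold):
    """Mark a pixel as foreground when it differs from the background
    by more than the threshold."""
    return [[abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]

def count_foreground(mask):
    """Number of foreground pixels in a binary mask."""
    return sum(sum(1 for is_fg in row if is_fg) for row in mask)

# Three noisy 4x4 background frames (assumed intensities 99, 100, 101).
background_frames = [[[value] * 4 for _ in range(4)] for value in (99, 100, 101)]
background = build_background(background_frames)

# A new frame: mild noise everywhere (103) and one object pixel (180).
frame = [[103] * 4 for _ in range(4)]
frame[1][2] = 180
silhouette = extract_silhouette(frame, background, threshold=10)

# The same frame after a global lighting change of +20 intensity levels.
brighter = [[p + 20 for p in row] for row in frame]
silhouette_bright = extract_silhouette(brighter, background, threshold=10)

print(count_foreground(silhouette), count_foreground(silhouette_bright))
```

With these values the first mask contains only the object pixel, while the brightened frame is labeled entirely as foreground, which is the kind of failure a more robust model is meant to avoid.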

5 Conclusions

This paper presents a survey of recent reconstruction algorithms that are able to reconstruct dense and complete 3D models. The scope of this survey is broad, since methods that exploit various types of equipment are included. In addition to the equipment-based categorization, a further classification is given within each category, based on the particular techniques that the methods use to recover the 3D shape of objects.

Acknowledgments. This work was supported by the 3DLife EU Network of Excellence (NoE) project.


Fig. 5. Reconstruction results for a time frame.

References

1. Kim, T., Seo, Y., Lee, S., Yang, Z., Chang, M.: Simultaneous registration of multiple views with markers. Computer-Aided Design, Elsevier, 41(4), 231–239 (2009)

2. Akca, D.: Full automatic registration of laser scanner point clouds. Optical 3-D Measurement Techniques VI, 1, 330–337 (2003)

3. Bienert, A., Maas, H.: Methods for the Automatic Geometric Registration of Terrestrial Laserscanner Point Clouds in Forest Stands. In ISPRS Workshop (2009)

4. Stamos, I., Liu, L., Chao, C., Wolberg, G., Yu, G., Zokai, S.: Integrating Automated Range Registration with Multiview Geometry for the Photorealistic Modeling of Large-Scale Scenes. IJCV, Springer, 78(2/3), 237–260 (2008)

5. Chen, C., Stamos, I.: Semi-automatic range to range registration: a feature-based method. In the 5th International Conference on 3-D Digital Imaging and Modeling, 254–261 (2005)

6. Jiang, J., Cheng, J., Chen, X.: Registration for 3-D point cloud using angular-invariant feature. Neurocomputing, Elsevier, 72, 3839–3844 (2009)

7. Smith, E., King, B., Stewart, C., Radke, R.: Registration of Combined Range-Intensity Scans: Initialization through Verification. Computer Vision and Image Understanding, 110, 226–244 (2008)

8. Seo, J., Sharp, G., Lee, S.: Range data registration using photometric features. In Proc. of CVPR, 2, 1140–1145 (2005)

9. Bendels, G., Degener, P., Wahl, R., Kortgen, M., Klein, R.: Image-based registration of 3d-range data using feature surface elements. In Proc. of International Symposium on Virtual Reality, Archaeology and Cultural Heritage VAST (2004)

10. Fischler, M., Bolles, R.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Graphics and Image Processing, 24(6), 381–395 (1981)

11. Lowe, D.G.: Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60, 91–110 (2004)


12. Stewart, C., Tsai, C., Roysam, B.: The Dual-Bootstrap iterative closest point algorithm with application to retinal image registration. IEEE Trans. Med. Imag., 22(11), 1379–1394 (2003)

13. Johnson, A., Hebert, M.: Surface registration by matching oriented points. In International Conference on Recent Advances in 3-D Imaging and Modelling, 121–128 (1997)

14. Mian, A., Bennamoun, M., Owens, R.: A Novel Representation and Feature Matching Algorithm for Automatic Pairwise Registration of Range Images. IJCV, Springer, 66(1), 19–40 (2006)

15. Makadia, A., Patterson, A., Daniilidis, K.: Fully automatic registration of 3d point clouds. In CVPR (2006)

16. Kang, S., Ikeuchi, K.: The complex EGI: A new representation for 3-D pose determination. TPAMI, 15(7), 707–721 (1993)

17. Kostelec, P., Rockmore, D.: FFTs on the Rotation Group. In Working Paper Series, Santa Fe Institute (2003)

18. Wyngaerd, V., Gool, V.: Combining texture and shape for automatic crude patch registration. In Proc. Fourth Int. Conf. on 3DIM, 179–186 (2003)

19. Besl, P., McKay, N.: A method for registration of 3-d shapes. IEEE Trans. Pattern Anal. Mach. Intell., 14(2), 239–256 (1992)

20. Rusinkiewicz, S., Levoy, M.: Efficient variants of the ICP algorithm. In Proc. Third Int. Conf. on 3DIM, 224–231 (2001)

21. Pottmann, H., Huang, Q., Yang, Y., Hu, S.: Geometry and convergence analysis of algorithms for registration of 3d shapes. Int. J. Comp. Vis., 67(3), 277–296 (2006)

22. Okatani, I., Sugimoto, A.: Registration of Range Images that Preserves Local Surface Structures and Color. In 3DPVT, 786–796 (2004)

23. Sharp, G., Lee, S., Wehe, D.: Maximum-Likelihood Registration of Range Images with Missing Data. TPAMI, 30(1), 120–130 (2008)

24. Matusik, W., Buehler, C., Raskar, R., Gortler, S., McMillan, L.: Image-based visual hulls. In SIGGRAPH Proceedings, 369–374 (2000)

25. Franco, J., Boyer, E.: Fusion of multi-view silhouette cues using a space occupancy grid. In ICCV, 2, 1747–1753 (2005)

26. Ishikawa, T., Yamazawa, K., Yokoya, N.: Real-time generation of novel views of a dynamic scene using morphing and visual hull. Proc. ICIP, 1013–1016 (2005)

27. Grauman, K., Shakhnarovich, G., Darrell, T.: A Bayesian Approach to Image-Based Visual Hull Reconstruction. In CVPR, 187–194 (2003)

28. Lai, P., Yilmaz, A.: Shape Recovery Using Rotated Slicing Planes. Int. Congress on Image and Signal Processing (2009)

29. Kutulakos, K., Seitz, S.: A Theory of Shape by Space Carving. In International Journal of Computer Vision, 38(3), 199–218 (2000)

30. Bhotika, R., Fleet, D., Kutulakos, K.: A Probabilistic Theory of Occupancy and Emptiness. In Proc. ECCV, 3, 112–132 (2002)

31. Yang, R., Pollefeys, M., Welch, G.: Dealing with textureless regions and specular highlights - A progressive space carving scheme using a novel photo-consistency measure. In ICCV, 576–584 (2003)

32. Ladikos, A., Benhimane, S., Navab, N.: Multi-View Reconstruction using Narrow-Band Graph-Cuts and Surface Normal Optimization. In BMVC (2008)

33. Vogiatzis, G., Torr, P., Cipolla, R.: Multi-view stereo via volumetric graph-cuts. In CVPR, 391–398 (2005)

34. Hornung, A., Kobbelt, L.: Hierarchical volumetric multi-view stereo reconstruction of manifold surfaces based on dual graph embedding. In CVPR, 1, 503–510 (2006)


35. Zeng, G., Paris, S., Quan, L., Sillion, F.: Progressive surface reconstruction from images using a local prior. In ICCV, 1230–1237 (2005)

36. Zeng, G., Paris, S., Quan, L., Sillion, F.: Accurate and scalable surface representation and reconstruction from images. TPAMI, 29(1), 141–158 (2007)

37. Jin, H., Soatto, S., Yezzi, A.: Multi-view stereo reconstruction of dense shape and complex appearance. International Journal of Computer Vision, 63(3), 175–189 (2005)

38. Yu, T., Ahuja, N., Chen, W.: SDG Cut: 3d reconstruction of non-lambertian objects using graph cuts on surface distance grid. In Proc. CVPR (2006)

39. Paris, S., Sillion, F., Quan, L.: A surface reconstruction method using global graph cut optimization. IJCV, Springer, 66(2), 141–161 (2006)

40. Pons, J., Keriven, R., Faugeras, O.: Multi-view stereo reconstruction and scene flow estimation with a global image-based matching score. IJCV, Springer, 72(2), 179–193 (2007)

41. Osher, S., Sethian, J.: Fronts propagating with curvature-dependent speed: algorithms based on Hamilton-Jacobi equations. J. of Comp. Physics, 79, 12–49 (1988)

42. Lhuillier, M., Quan, L.: Surface Reconstruction by Integrating 3D and 2D Data of Multiple Views. In ICCV (2003)

43. Lhuillier, M., Quan, L.: A Quasi-Dense Approach to Surface Reconstruction from Uncalibrated Images. TPAMI, 27(3), 418–433 (2005)

44. Kolev, K., Klodt, M., Brox, T., Cremers, D.: Continuous Global Optimization in Multiview 3D Reconstruction. IJCV, Springer, 84, 80–96 (2009)

45. Tran, S., Davis, L.: 3d surface reconstruction using graph cuts with surface constraints. In ECCV, 2, 218–231 (2006)

46. Sinha, S., Mordohai, P., Pollefeys, M.: Multi-view stereo via graph cuts on the dual of an adaptive tetrahedral mesh. In ICCV (2007)

47. Vogiatzis, G., Hernandez, C., Torr, P., Cipolla, R.: Multiview stereo via volumetric graph-cuts and occlusion robust photo-consistency. TPAMI, 29(12), 2241–2246 (2007)

48. Zhang, Z.: A flexible new technique for camera calibration. In IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11), 1330–1334 (2000)

49. Matsuyama, T., Wu, X., Takai, T., Wada, T.: Real-Time Dynamic 3-D Object Shape Reconstruction and High-Fidelity Texture Mapping for 3-D Video. In IEEE Transactions on Circuits and Systems for Video Technology, 14(3), 357–369 (2004)

50. Lorensen, W., Cline, H.: Marching cubes: A high resolution 3d surface construction algorithm. Computer Graphics, 21(4), 163–169 (1987)

51. Kim, K., Chalidabhongse, T., Harwood, D., Davis, L.: Real-time foreground-background segmentation using codebook model. Real-Time Imaging, 11, 172–185 (2005)

52. Autodesk 3ds Max, http://usa.autodesk.com/

53. VRMesh - For point cloud and triangle mesh processing, http://www.vrmesh.com/

54. 3DLife-Towards a VCE for Media Internet, http://www.3dlife-noe.eu/3DLife/
