optimized stereo reconstruction using 3d graphics hardware
TRANSCRIPT
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
1/49
Optimized Stereo Reconstruction
Using 3D Graphics Hardware
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
2/49
Abstract
In this paper we describe significantenhancements of our recently developed
stereo reconstruction software exploiting
features of commodity 3D graphics hardware
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
3/49
The aim of our research is the acceleration of
Dense reconstructions from stereo images
The implementation of our hardware basedmatching procedure avoids loss of accuracy
due to the limited precision of calculations
performed in 3D graphics processors
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
4/49
In this work we target on improving the time
to obtain reconstructions for images with small
resolution, and we covered other enhancements
to increase the speed for highly detailed
reconstructions.
Our optimized implementation is able to generate
up to 130 000 depth values per second on standard
PC hardware
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
5/49
1 Introduction
The automatic creation of digital 3D models
from images of real objects comprise a main
step in the generation of virtual environments
and is still a challenging research topic
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
6/49
Many stereo reconstruction algorithms toobtain models from images were
proposed,and almost all approaches are
based on traditional computations
performed in the main CPU
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
7/49
Very recently few methods were proposed,that exploit the parallelism and special-
purpose circuits found in current 3D graphics
hardware to accelerate the reconstruction
process
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
8/49
Although the method is still significantlyslower than the planesweeping approach
proposed by Yang et al. [19] (using a five
camera setup), our approach provides
goodresults with a calibrated stereo setup.
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
9/49
2 Related Work
2.1 Reconstruction from Stereo Images Stereomatching is the process offinding correspond-
ing or homologous points in two or more
images and is an essential task in computer
vision.
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
10/49
We distinguish between feature and areabased methods.
Feature based methods incline to sparse
but accurate results, whereas area based
methods deliver dense point clouds.
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
11/49
Feature based methods
The former methods use local features thatare extracted from intensity or color
distribution. These extracted feature vectors
are utilized to solve the correspondence
problem
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
12/49
Area based methods
Area based methods employ a similarityconstraint, where corresponding points are
assumed to have similar intensity or color
within a local window.
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
13/49
The Triclops vision system
consists of a hardware setup with three
cameras and appropriate software for
realtime stereo matching. The system is able
to generate depth images at a rate of about
20Hz for images up to 320x240 pixels on
current PC hardware
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
14/49
The Triclops vision system(cont..)
The software exploits the particularorientation of the cameras and MMX/SSE
instructions available on current CPUs.In
contrast to the Triclops system our approachcan handle images from cameras with
arbitrary relative orientation
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
15/49
Yang et al. [19] developed a fast stereoreconstruction method performed in 3D
hardware by utilizing a plane sweep approach
to find correct depth values.The number ofiterations is linear in the requested resolution
of depth values, therefore this method
islimited to rather coarse depth estimation
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
16/49
2.2 Accelerated Computations in
Graphics Hardware
Even before programmable graphics hardwarewas available, the fixed function pipeline of 3D
graphics processors was utilized to accelerate
numerical calculations [5, 6] and even toemulate programmable shading
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
17/49
Recently several authors identified currentgraph-ics hardware as a kind of auxiliaryprocessing unitto perform SIMD (single
instruction, multiple data)operationsefficiently. Thompson et al. [17] implementedseveral non-graphical algorithms to run onprogrammable graphics hardware and profiled
the execution times against CPU basedimplementa-tions.
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
18/49
3. Overview of Our Matching
Procedure
The original implementation of our 3Dgraphics hard-ware based matching procedure
is described in [20].The matching algorithm
essentially warps the left and the right imageonto the current triangular 3D
meshhypothesis according to the relative
orientation be-tween the images
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
19/49
The projection of the mesh into the left imageconstitutes always a regular grid (i.e. the left
image is never distorted). The warped images
are subtracted and the sum of absolutediffenences is calculated in a window around
the mesh vertices
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
20/49
The main steps of the algorithm, namelyimage warping and differencing, sum of
absolute differences calculation, and finding
the optimal depth value are performed on thegraphics processing unit. The main CPU
executes the control flow and updates the
current mesh hypothesis
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
21/49
In order to minimize transfer between hostand graphics memory most changes applied to
the current mesh geometry are only
appliedvirtually without requiring real updateof 3D vertices.For details we refer to our
earlier work
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
22/49
The control flow of the matching
algorithm
1 .The outermost loop adds the hierarchical
approach to our method and reduces the
probability of local minima e.g. imposed by
repetitivestructures. In every iteration themesh and image resolutions are doubled. The
refined mesh is obtained by linear (and
optionally median) filtering of the coarser one
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
23/49
The control flow of the matching
algorithm
The inner loop chooses a set of independentvertices to be modified and updates the depth
val-ues of these vertices after performing the
inner-most loop. In order to avoid influence ofdepthchanges of neighboring vertices, only
one quarterof the vertices can be modified
simultaneously
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
24/49
The control flow of the matching
algorithm
The innermost loop evaluates depth variations
for candidate vertices selected in the enclosing
loop. The best depth value is determined by re-peated image warping and error calculation wrt.
the tested depth hypothesis. The body of this
loop runs entirely on the 3D graphics processingunit (GPU).
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
25/49
4 Amortized Difference Image
Generation
For larger image resolutions (e.g. 1024 1024)rendering of the corresponding meshgenerated by the sampling points takes a
considerable amount of time.In the1megapixel case the mesh consists ofapproximately 131 000 triangles, which mustbe rendered for
every depth value (several hundred times intotal)
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
26/49
Especially on mobile graphic boards mesh
processing implies a severe performance penalty:
stereo matching of two 256256 images has
comparable speed ona desktop GPU and on a usually slower mobile
GPU
intended for laptop PCs, but matching 1-megapixelimages requires two times longer on the mobile
GPU
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
27/49
In order to reduce the number of meshdrawings up to four depth values are
evaluated in one pass.We use multitexturing
facilities to generate four texture coordinatesfor different depth values within the vertex
program
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
28/49
The fragment shader calculates the absolutedifferences for these deformations simultane-
ously and stores the results in the four color
channels(red, green, blue and alpha)
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
29/49
Note that the mesh hypothesis is updatedinfrequently and the actually evaluated mesh
is generated within the vertex shader by
deforming the incoming vertices according tothe current displacement
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
30/49
5. Chunked Image Transforms
In contrast to Yang and Pollefeys [18] wecalculate the error within a window explicitly
using multiple passes
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
31/49
In every pass four adjacent pixels areaccumulated and the result is written to a
temporary off-screen frame buffer (usually
called pixel buffer or P-buffer for short)
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
32/49
It is possible to set pixel buffers as destinationfor rendering operations (write access)or to
bind a pixel buffer as a texture (read
access),but a combined read and write accessis not available.In the default setting the
window size is 88, therefore 3 passes are
required
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
33/49
Note that we use specific encoding ofsummed values to avoid overflow due to the
limited accuracy of one color channel. We
refer to [20] for detail
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
34/49
Executing this multipass pipeline to obtain the
sum of absolute differences within a window
requires several P-buffer activations to select
the correct target buffer for writing. These
switches turned out to be relatively expensive
(about 0.15ms per switch
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
35/49
Incombination with the large number ofswitches the total time spent within these
operations comprise a significant fraction of
the overall matching time (about50% for256256 images). If the number of these
operations can be optimized, one can expect
substantial increase in performance of thematching procedure
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
36/49
Instead of directly executing the pipeline inthe innermost loop (requiring 5 P-buffer
switches) we reorganize the loops to
accumulate several intermediate results inone larger buffer with temporary results
arranged in tiles (see Figure 2)
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
37/49
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
38/49
Therefore P-buffer switches are amortizedover several iterations of the innermost loop.
This flexibility in the control flow is completely
transparent and needs not to be codedexplicitly within the software. Those stages in
the pipeline waiting for the input buffer to
become ready are skipped automatically
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
39/49
6. Minimum Determination Us-
ing the Depth Test
Thc hin ban u ca chng ti, chng tis dng mt mnhchng trnh cp nht cc li ti thiu
v ti uchiu su gi tr tm thy cho n nay.Cch tip cn ny hot ng trn cc thin thoi di ng ha, nhng n l
kh chm do kch hot b m cnthit P-(k t khi vic tnh ton ti thiu cth khng c thc hin ti ch)
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
40/49
Z-buffer tests allow conditional updates of theframe buffer in-place, but only very currentgraphics cards support user-defined
assignment of z-values within the fragmentshader
(e.g. by using the ARB_FRAGMENT_PROGRAMOpenGL extension). Older hardware always
interpolates z- values from the given geometry(vertices)
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
41/49
Using the depthtest provided by 3D graphicshardware, the given index of the currently
evaluated depth variation and the
corresponding sum of absolute differences iswritten into the destination buffer, if the
incoming erroris smaller than the minimum
already stored in the buffer
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
42/49
Therefore a point-wise optimum for evaluateddepth values for mesh vertices can be
computed easily and efficiently
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
43/49
7. Disparity Estimation
A rather straightforward extension of ourepipolarmatching procedure is disparity
estimation for two uncalibatrated images
without known relative orientation. In orderto address this setting the 3D meshhypothesis
(i.e. a 2D to 1D (depth) mapping) is replaced
by a 2D to 2D (texture coordinates)mapping,but the basic algorithm remains the
same.
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
44/49
Instead of an 1D search along the epipolar linethe procedure performs a 2D search for
appropriate disparity values
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
45/49
In this setting the rather complex vertexshader to generate texture coordinates for the
right image used for epipolar matching can be
omitted. In every iteration the current 2Ddisparity map hypothesis is modified in
vertical and horizontal direction and evaluated
(Figure 4).
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
46/49
With this approach it is possible to findcorresponding regions in typical small base-
line settings (i.e. only small change in position
and rotation between the two images)
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
47/49
8. Results
We run timing experiments on a desktop PCwith an Athlon XP 2700 and an ATI Radeon
9700 and on a laptop PC with a mobile Athlon
XP 2200 and an ATI Radeon 9000 Mobility.Table 1 depicts timingresults for those two
hardware setups and illustratesthe effect of
the various optimizations mentioned intheprevious sections
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
48/49
The timings were obtainedfor epipolar stereomatching of the dataset shownin Figure 5. The
total number of evaluated meshhypothesis is
224 (256x256), 280 (512x512) and336(1024x1024). In the highest resolution
(1024x1024)each vertex is actually tested with
84 depth valuesout of a range ofapproximately 600 possible values
-
8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware
49/49
Because of limitations in graphics hardwarewe arecurrently restricted to images with
power of two dimensions.