optimized stereo reconstruction using 3d graphics hardware

Upload: minh-tung

Post on 07-Apr-2018

225 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    1/49

    Optimized Stereo Reconstruction

    Using 3D Graphics Hardware

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    2/49

    Abstract

    In this paper we describe significantenhancements of our recently developed

    stereo reconstruction software exploiting

    features of commodity 3D graphics hardware

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    3/49

    The aim of our research is the acceleration of

    Dense reconstructions from stereo images

    The implementation of our hardware basedmatching procedure avoids loss of accuracy

    due to the limited precision of calculations

    performed in 3D graphics processors

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    4/49

    In this work we target on improving the time

    to obtain reconstructions for images with small

    resolution, and we covered other enhancements

    to increase the speed for highly detailed

    reconstructions.

    Our optimized implementation is able to generate

    up to 130 000 depth values per second on standard

    PC hardware

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    5/49

    1 Introduction

    The automatic creation of digital 3D models

    from images of real objects comprise a main

    step in the generation of virtual environments

    and is still a challenging research topic

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    6/49

    Many stereo reconstruction algorithms toobtain models from images were

    proposed,and almost all approaches are

    based on traditional computations

    performed in the main CPU

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    7/49

    Very recently few methods were proposed,that exploit the parallelism and special-

    purpose circuits found in current 3D graphics

    hardware to accelerate the reconstruction

    process

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    8/49

    Although the method is still significantlyslower than the planesweeping approach

    proposed by Yang et al. [19] (using a five

    camera setup), our approach provides

    goodresults with a calibrated stereo setup.

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    9/49

    2 Related Work

    2.1 Reconstruction from Stereo Images Stereomatching is the process offinding correspond-

    ing or homologous points in two or more

    images and is an essential task in computer

    vision.

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    10/49

    We distinguish between feature and areabased methods.

    Feature based methods incline to sparse

    but accurate results, whereas area based

    methods deliver dense point clouds.

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    11/49

    Feature based methods

    The former methods use local features thatare extracted from intensity or color

    distribution. These extracted feature vectors

    are utilized to solve the correspondence

    problem

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    12/49

    Area based methods

    Area based methods employ a similarityconstraint, where corresponding points are

    assumed to have similar intensity or color

    within a local window.

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    13/49

    The Triclops vision system

    consists of a hardware setup with three

    cameras and appropriate software for

    realtime stereo matching. The system is able

    to generate depth images at a rate of about

    20Hz for images up to 320x240 pixels on

    current PC hardware

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    14/49

    The Triclops vision system(cont..)

    The software exploits the particularorientation of the cameras and MMX/SSE

    instructions available on current CPUs.In

    contrast to the Triclops system our approachcan handle images from cameras with

    arbitrary relative orientation

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    15/49

    Yang et al. [19] developed a fast stereoreconstruction method performed in 3D

    hardware by utilizing a plane sweep approach

    to find correct depth values.The number ofiterations is linear in the requested resolution

    of depth values, therefore this method

    islimited to rather coarse depth estimation

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    16/49

    2.2 Accelerated Computations in

    Graphics Hardware

    Even before programmable graphics hardwarewas available, the fixed function pipeline of 3D

    graphics processors was utilized to accelerate

    numerical calculations [5, 6] and even toemulate programmable shading

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    17/49

    Recently several authors identified currentgraph-ics hardware as a kind of auxiliaryprocessing unitto perform SIMD (single

    instruction, multiple data)operationsefficiently. Thompson et al. [17] implementedseveral non-graphical algorithms to run onprogrammable graphics hardware and profiled

    the execution times against CPU basedimplementa-tions.

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    18/49

    3. Overview of Our Matching

    Procedure

    The original implementation of our 3Dgraphics hard-ware based matching procedure

    is described in [20].The matching algorithm

    essentially warps the left and the right imageonto the current triangular 3D

    meshhypothesis according to the relative

    orientation be-tween the images

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    19/49

    The projection of the mesh into the left imageconstitutes always a regular grid (i.e. the left

    image is never distorted). The warped images

    are subtracted and the sum of absolutediffenences is calculated in a window around

    the mesh vertices

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    20/49

    The main steps of the algorithm, namelyimage warping and differencing, sum of

    absolute differences calculation, and finding

    the optimal depth value are performed on thegraphics processing unit. The main CPU

    executes the control flow and updates the

    current mesh hypothesis

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    21/49

    In order to minimize transfer between hostand graphics memory most changes applied to

    the current mesh geometry are only

    appliedvirtually without requiring real updateof 3D vertices.For details we refer to our

    earlier work

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    22/49

    The control flow of the matching

    algorithm

    1 .The outermost loop adds the hierarchical

    approach to our method and reduces the

    probability of local minima e.g. imposed by

    repetitivestructures. In every iteration themesh and image resolutions are doubled. The

    refined mesh is obtained by linear (and

    optionally median) filtering of the coarser one

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    23/49

    The control flow of the matching

    algorithm

    The inner loop chooses a set of independentvertices to be modified and updates the depth

    val-ues of these vertices after performing the

    inner-most loop. In order to avoid influence ofdepthchanges of neighboring vertices, only

    one quarterof the vertices can be modified

    simultaneously

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    24/49

    The control flow of the matching

    algorithm

    The innermost loop evaluates depth variations

    for candidate vertices selected in the enclosing

    loop. The best depth value is determined by re-peated image warping and error calculation wrt.

    the tested depth hypothesis. The body of this

    loop runs entirely on the 3D graphics processingunit (GPU).

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    25/49

    4 Amortized Difference Image

    Generation

    For larger image resolutions (e.g. 1024 1024)rendering of the corresponding meshgenerated by the sampling points takes a

    considerable amount of time.In the1megapixel case the mesh consists ofapproximately 131 000 triangles, which mustbe rendered for

    every depth value (several hundred times intotal)

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    26/49

    Especially on mobile graphic boards mesh

    processing implies a severe performance penalty:

    stereo matching of two 256256 images has

    comparable speed ona desktop GPU and on a usually slower mobile

    GPU

    intended for laptop PCs, but matching 1-megapixelimages requires two times longer on the mobile

    GPU

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    27/49

    In order to reduce the number of meshdrawings up to four depth values are

    evaluated in one pass.We use multitexturing

    facilities to generate four texture coordinatesfor different depth values within the vertex

    program

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    28/49

    The fragment shader calculates the absolutedifferences for these deformations simultane-

    ously and stores the results in the four color

    channels(red, green, blue and alpha)

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    29/49

    Note that the mesh hypothesis is updatedinfrequently and the actually evaluated mesh

    is generated within the vertex shader by

    deforming the incoming vertices according tothe current displacement

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    30/49

    5. Chunked Image Transforms

    In contrast to Yang and Pollefeys [18] wecalculate the error within a window explicitly

    using multiple passes

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    31/49

    In every pass four adjacent pixels areaccumulated and the result is written to a

    temporary off-screen frame buffer (usually

    called pixel buffer or P-buffer for short)

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    32/49

    It is possible to set pixel buffers as destinationfor rendering operations (write access)or to

    bind a pixel buffer as a texture (read

    access),but a combined read and write accessis not available.In the default setting the

    window size is 88, therefore 3 passes are

    required

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    33/49

    Note that we use specific encoding ofsummed values to avoid overflow due to the

    limited accuracy of one color channel. We

    refer to [20] for detail

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    34/49

    Executing this multipass pipeline to obtain the

    sum of absolute differences within a window

    requires several P-buffer activations to select

    the correct target buffer for writing. These

    switches turned out to be relatively expensive

    (about 0.15ms per switch

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    35/49

    Incombination with the large number ofswitches the total time spent within these

    operations comprise a significant fraction of

    the overall matching time (about50% for256256 images). If the number of these

    operations can be optimized, one can expect

    substantial increase in performance of thematching procedure

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    36/49

    Instead of directly executing the pipeline inthe innermost loop (requiring 5 P-buffer

    switches) we reorganize the loops to

    accumulate several intermediate results inone larger buffer with temporary results

    arranged in tiles (see Figure 2)

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    37/49

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    38/49

    Therefore P-buffer switches are amortizedover several iterations of the innermost loop.

    This flexibility in the control flow is completely

    transparent and needs not to be codedexplicitly within the software. Those stages in

    the pipeline waiting for the input buffer to

    become ready are skipped automatically

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    39/49

    6. Minimum Determination Us-

    ing the Depth Test

    Thc hin ban u ca chng ti, chng tis dng mt mnhchng trnh cp nht cc li ti thiu

    v ti uchiu su gi tr tm thy cho n nay.Cch tip cn ny hot ng trn cc thin thoi di ng ha, nhng n l

    kh chm do kch hot b m cnthit P-(k t khi vic tnh ton ti thiu cth khng c thc hin ti ch)

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    40/49

    Z-buffer tests allow conditional updates of theframe buffer in-place, but only very currentgraphics cards support user-defined

    assignment of z-values within the fragmentshader

    (e.g. by using the ARB_FRAGMENT_PROGRAMOpenGL extension). Older hardware always

    interpolates z- values from the given geometry(vertices)

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    41/49

    Using the depthtest provided by 3D graphicshardware, the given index of the currently

    evaluated depth variation and the

    corresponding sum of absolute differences iswritten into the destination buffer, if the

    incoming erroris smaller than the minimum

    already stored in the buffer

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    42/49

    Therefore a point-wise optimum for evaluateddepth values for mesh vertices can be

    computed easily and efficiently

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    43/49

    7. Disparity Estimation

    A rather straightforward extension of ourepipolarmatching procedure is disparity

    estimation for two uncalibatrated images

    without known relative orientation. In orderto address this setting the 3D meshhypothesis

    (i.e. a 2D to 1D (depth) mapping) is replaced

    by a 2D to 2D (texture coordinates)mapping,but the basic algorithm remains the

    same.

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    44/49

    Instead of an 1D search along the epipolar linethe procedure performs a 2D search for

    appropriate disparity values

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    45/49

    In this setting the rather complex vertexshader to generate texture coordinates for the

    right image used for epipolar matching can be

    omitted. In every iteration the current 2Ddisparity map hypothesis is modified in

    vertical and horizontal direction and evaluated

    (Figure 4).

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    46/49

    With this approach it is possible to findcorresponding regions in typical small base-

    line settings (i.e. only small change in position

    and rotation between the two images)

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    47/49

    8. Results

    We run timing experiments on a desktop PCwith an Athlon XP 2700 and an ATI Radeon

    9700 and on a laptop PC with a mobile Athlon

    XP 2200 and an ATI Radeon 9000 Mobility.Table 1 depicts timingresults for those two

    hardware setups and illustratesthe effect of

    the various optimizations mentioned intheprevious sections

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    48/49

    The timings were obtainedfor epipolar stereomatching of the dataset shownin Figure 5. The

    total number of evaluated meshhypothesis is

    224 (256x256), 280 (512x512) and336(1024x1024). In the highest resolution

    (1024x1024)each vertex is actually tested with

    84 depth valuesout of a range ofapproximately 600 possible values

  • 8/4/2019 Optimized Stereo Reconstruction Using 3D Graphics Hardware

    49/49

    Because of limitations in graphics hardwarewe arecurrently restricted to images with

    power of two dimensions.