Post on 21-Dec-2015
Stereo and Multiview Sequence Processing
Outline
Stereopsis
Stereo Imaging Principle
Disparity Estimation
Intermediate View Synthesis
Stereo Sequence Coding
Stereopsis
Retinal disparity: the horizontal distance between the corresponding left and right image points of the superimposed retinal images.
The disparity is zero for the point on which the eyes are converged.
Stereopsis: the sense of depth that the mind forms by combining two different perspective views.
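Stereo Imaging Principle (1)
Arbitrary Camera Configuration
[Figure: world frame C_w and camera frames C_l, C_r; a 3-D point X projects to (x_l, y_l) in the left image and (x_r, y_r) in the right image]
Camera coordinates of a world point X ([R]_l, [R]_r: orthonormal):
$$\mathbf{X}_l = [\mathbf{R}]_l \mathbf{X} + \mathbf{T}_l, \qquad \mathbf{X}_r = [\mathbf{R}]_r \mathbf{X} + \mathbf{T}_r$$
Perspective projection:
$$x_i = F\,\frac{X_i}{Z_i}, \qquad y_i = F\,\frac{Y_i}{Z_i}, \qquad i = l, r$$
Relation between the two camera coordinate systems:
$$\mathbf{X}_r = [\mathbf{R}]_{rl} \mathbf{X}_l + \mathbf{T}_{rl}, \quad \text{where } [\mathbf{R}]_{rl} = [\mathbf{R}]_r [\mathbf{R}]_l^T, \quad \mathbf{T}_{rl} = \mathbf{T}_r - [\mathbf{R}]_r [\mathbf{R}]_l^T \mathbf{T}_l$$
In terms of image coordinates:
$$\frac{Z_r}{F} \begin{bmatrix} x_r \\ y_r \\ F \end{bmatrix} = [\mathbf{R}]_{rl}\, \frac{Z_l}{F} \begin{bmatrix} x_l \\ y_l \\ F \end{bmatrix} + \mathbf{T}_{rl}$$
Given x_l and x_r → Z_r and Z_l → X_l, X_r, Y_l, Y_r → (X, Y, Z)
(Eqs. 12.2.1 to 12.2.5)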
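The last step on this slide (given x_l and x_r, recover Z_l and Z_r) can be sketched in Python. This is only an illustration under stated assumptions: the function name `triangulate_depths` is hypothetical, and a least-squares solve is one possible way to handle the three equations in two unknowns that the image-coordinate relation provides.

```python
def triangulate_depths(xl, yl, xr, yr, F, R_rl, T_rl):
    """Least-squares estimate of (Z_l, Z_r) for one pair of corresponding points.

    Solves Z_r * p_r = Z_l * ([R]_rl p_l) + T_rl with p_i = (x_i/F, y_i/F, 1).
    R_rl is a 3x3 nested list, T_rl a 3-element sequence (assumed inputs).
    """
    p_l = (xl / F, yl / F, 1.0)
    p_r = (xr / F, yr / F, 1.0)
    # a = [R]_rl p_l, the left viewing ray expressed in the right camera frame
    a = [sum(R_rl[i][j] * p_l[j] for j in range(3)) for i in range(3)]
    aa = sum(u * u for u in a)
    bb = sum(v * v for v in p_r)
    ab = sum(u * v for u, v in zip(a, p_r))
    at = sum(u * t for u, t in zip(a, T_rl))
    bt = sum(v * t for v, t in zip(p_r, T_rl))
    # Normal equations of min ||Z_l * a - Z_r * p_r + T_rl||^2
    det = aa * bb - ab * ab
    if det == 0:
        raise ValueError("viewing rays are parallel; depth is undefined")
    Z_l = (ab * bt - bb * at) / det
    Z_r = (aa * bt - ab * at) / det
    return Z_l, Z_r
```

Once Z_l and Z_r are known, X_i = x_i Z_i / F and Y_i = y_i Z_i / F follow from the perspective projection, and the world point (X, Y, Z) from inverting the camera transform.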
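Stereo Imaging Principle (2)
Parallel Camera Configuration
[Figure: parallel camera configuration with baseline B and focal length F; 3-D view and X-Z view (Y = 0), with the point at X + B/2 in the left camera frame and X - B/2 in the right camera frame]
$$X_l = X + \frac{B}{2}, \quad X_r = X - \frac{B}{2}, \quad Y_l = Y_r = Y, \quad Z_l = Z_r = Z \qquad (12.2.6)$$
$$x_l = F\,\frac{X + B/2}{Z}, \quad x_r = F\,\frac{X - B/2}{Z}, \quad y_l = y_r = y = F\,\frac{Y}{Z} \qquad (12.2.7)$$
$$d_x = x_l - x_r = \frac{FB}{Z} \qquad (12.2.8)$$
$$X = \frac{B\,(x_l + x_r)}{2 d_x}, \quad Y = \frac{B y}{d_x}, \quad Z = \frac{FB}{d_x} \qquad (12.2.9)$$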
Results of Eq. (12.2.8)
It is the basis for deriving depth from the disparity information.
The disparity value of a 3-D point (X, Y, Z) is independent of the X and Y coordinates and is inversely proportional to the Z value.
The range of the disparity increases with the baseline B, the distance between the two cameras.
The disparity is always positive: d_x > 0.
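As a minimal sketch of how depth follows from disparity in the parallel configuration, the function below inverts d_x = FB/Z for one pair of corresponding points. The name `depth_from_disparity` and the argument layout are assumptions made for illustration.

```python
def depth_from_disparity(xl, xr, y, F, B):
    """Recover (X, Y, Z) from corresponding points in a parallel stereo pair.

    Uses d_x = x_l - x_r = F*B/Z, so Z = F*B/d_x; X and Y follow from the
    perspective projection. F is the focal length and B the baseline.
    """
    d = xl - xr
    if d <= 0:
        raise ValueError("parallel cameras require a positive disparity")
    Z = F * B / d
    X = B * (xl + xr) / (2 * d)
    Y = B * y / d
    return X, Y, Z
```

Because Z is inversely proportional to d_x, a fixed disparity error causes a depth error that grows with distance, which is one reason a larger baseline B helps for distant objects.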
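Stereo Imaging Principle (3)
Converging Camera Configuration
[Figure: converging camera configuration with baseline B, each camera rotated by an angle θ toward the other; 3-D view and X-Z view (Y = 0)]
$$[\mathbf{R}]_l = \begin{bmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{bmatrix}, \quad \mathbf{T}_l = \begin{bmatrix} \cos\theta\, B/2 \\ 0 \\ \sin\theta\, B/2 \end{bmatrix} \qquad (12.2.10)$$
$$[\mathbf{R}]_r = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}, \quad \mathbf{T}_r = \begin{bmatrix} -\cos\theta\, B/2 \\ 0 \\ \sin\theta\, B/2 \end{bmatrix} \qquad (12.2.11)$$
Substituting into Eqs. (12.2.2) and (12.2.4):
$$x_l = F\,\frac{\cos\theta\,(X + B/2) - \sin\theta\, Z}{\sin\theta\,(X + B/2) + \cos\theta\, Z}, \quad y_l = F\,\frac{Y}{\sin\theta\,(X + B/2) + \cos\theta\, Z}$$
$$x_r = F\,\frac{\cos\theta\,(X - B/2) + \sin\theta\, Z}{-\sin\theta\,(X - B/2) + \cos\theta\, Z}, \quad y_r = F\,\frac{Y}{-\sin\theta\,(X - B/2) + \cos\theta\, Z} \qquad (12.2.12)$$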
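The projection for the converging configuration can be sketched as follows. This assumes the sign convention in which each camera is rotated by an angle `theta` toward the other and the left camera sits at X = -B/2 in world coordinates; the function name `project_converging` is hypothetical.

```python
import math


def project_converging(X, Y, Z, F, B, theta):
    """Project a world point into a converging stereo pair.

    Returns ((x_l, y_l), (x_r, y_r)). With theta = 0 this reduces to the
    parallel configuration.
    """
    c, s = math.cos(theta), math.sin(theta)
    # Depth of the point along each camera's optical axis
    Z_l = s * (X + B / 2) + c * Z
    Z_r = -s * (X - B / 2) + c * Z
    x_l = F * (c * (X + B / 2) - s * Z) / Z_l
    x_r = F * (c * (X - B / 2) + s * Z) / Z_r
    return (x_l, F * Y / Z_l), (x_r, F * Y / Z_r)
```

Since Z_l and Z_r generally differ, y_l and y_r differ as well, so a converging pair produces vertical as well as horizontal disparity. Only the point where the two optical axes meet projects to the image center in both views.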
Stereo Imaging Principle (4)
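Epipolar Geometry
Epipolar constraint: for any imaged point that falls on the left epipolar line, its corresponding pixel in the right image must lie on the right epipolar line.
Fundamental matrix: the relation between an image point and its epipolar line can be characterized by a 3 × 3 matrix [F]:
$$\tilde{\mathbf{x}}_r^T [\mathbf{F}]\, \tilde{\mathbf{x}}_l = 0, \qquad \tilde{\mathbf{x}}_l^T [\mathbf{F}]^T \tilde{\mathbf{x}}_r = 0, \qquad \tilde{\mathbf{x}}^T = [\mathbf{x}^T, 1]$$
[Figure: epipolar geometry for a 3-D point X imaged at x_l and x_r by cameras C_l and C_r. Labels: epipolar plane; ep_l, ep_r: epipolar lines; e_l: left epipole; e_r: right epipole]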
Stereo Imaging Principle (4)
Parallel camera
The epipoles are at infinity, and the epipolar lines are parallel.
For any given point, the left and right epipolar lines associated with this point are horizontal lines with the same y coordinate as this point.
This simplifies the disparity estimation problem.
[Figure: parallel cameras C_l and C_r with epipoles e_l and e_r at infinity]
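A small sketch of the epipolar constraint in code, under stated assumptions: the helper names are hypothetical, and `F_PARALLEL` is the fundamental matrix (up to scale) of an ideal parallel pair with identical cameras, which is not given on the slides but follows from a pure horizontal translation between the views.

```python
def epipolar_line(F_mat, x, y):
    """Coefficients (a, b, c) of the right-image line a*x' + b*y' + c = 0
    obtained as [F] applied to the homogeneous left point (x, y, 1)."""
    p = (x, y, 1.0)
    return tuple(sum(F_mat[i][j] * p[j] for j in range(3)) for i in range(3))


def epipolar_residual(F_mat, left_pt, right_pt):
    """Value of x_r~^T [F] x_l~; zero when the pair satisfies the constraint."""
    a, b, c = epipolar_line(F_mat, left_pt[0], left_pt[1])
    return a * right_pt[0] + b * right_pt[1] + c


# Assumed fundamental matrix of an ideal parallel pair (up to scale):
# the constraint reduces to y_r = y_l.
F_PARALLEL = [[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]]
```

With `F_PARALLEL`, the line returned for any left point has a = 0, so it is horizontal and passes through the same y coordinate, which is why the correspondence search collapses to a 1-D search along the scan line.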
Disparity Estimation (1)
Constraints on Disparity Distribution
Epipolar constraint
Unidirectionality with parallel cameras: with the parallel camera configuration, the disparity vector (DV) has only a horizontal component and is always positive.
Ordering constraint: let x_{r,1} and x_{r,2} be two points in the right image on the same horizontal line. Then
$$x_{r,1} < x_{r,2} \;\Rightarrow\; x_{l,1} < x_{l,2} \;\Rightarrow\; d_{x,2} > d_{x,1} + x_{r,1} - x_{r,2}$$
because
$$x_{l,2} > x_{l,1} \;\Rightarrow\; x_{l,2} - x_{r,2} > x_{l,1} - x_{r,2} \;\Rightarrow\; d_{x,2} > (x_{l,1} - x_{r,1}) + x_{r,1} - x_{r,2}$$
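The ordering constraint can be checked mechanically on a set of candidate matches. The sketch below is illustrative only; the function name and the `(x_r, x_l)` tuple layout are assumptions.

```python
def satisfies_ordering(matches):
    """Check the ordering constraint on one scan line.

    matches: list of (x_r, x_l) pairs of corresponding positions.
    For x_r1 < x_r2 the constraint requires d_x2 > d_x1 + x_r1 - x_r2,
    which is equivalent to x_l1 < x_l2.
    """
    ordered = sorted(matches)  # sort by right-image position
    for (xr1, xl1), (xr2, xl2) in zip(ordered, ordered[1:]):
        d1, d2 = xl1 - xr1, xl2 - xr2
        if xr1 < xr2 and not d2 > d1 + xr1 - xr2:
            return False
    return True
```

Checking adjacent pairs is enough because the left positions must be strictly increasing along the sorted right positions.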
Disparity Estimation (2)
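Models for the Disparity Function
A simple case: the surface of the imaged scene is approximated by a plane.
$$Z(X, Y) = aX + bY + c \qquad (12.3.1)$$
Using Eqs. (12.2.6) and (12.2.7):
$$Z(x_r, y_r) = a\left(\frac{x_r Z}{F} + \frac{B}{2}\right) + b\,\frac{y_r Z}{F} + c \qquad (12.3.2)$$
$$Z(x_r, y_r) = \frac{c + aB/2}{1 - a\,x_r/F - b\,y_r/F} \qquad (12.3.3)$$
$$d_x(x_r, y_r) = \frac{1}{c/B + a/2}\,(F - a\,x_r - b\,y_r) \qquad (12.3.4)$$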
The disparity function is affine in the image coordinates when the surface is a plane.
Patch: a region in which the planar condition holds. The image is divided into small patches such that each patch is approximately planar.
The disparity estimation problem then becomes, equivalently:
the estimation of three affine parameters for each patch;
the estimation of the disparity (d_x only) at the three corner points of each patch;
the estimation of the disparity at the nodal points, after which the disparity function within each patch is interpolated from the nodal points using the affine model.
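The equivalence between three affine parameters and three corner disparities can be made concrete with a short sketch. The function names are hypothetical, and the parameterization d_x = p0 + p1·x + p2·y is simply one way of writing the affine model.

```python
def plane_disparity(x, y, a, b, c, F, B):
    """Disparity at right-image point (x, y) for the plane Z = aX + bY + c."""
    return (F - a * x - b * y) / (c / B + a / 2)


def fit_affine_disparity(corners, disparities):
    """Affine parameters (p0, p1, p2) with d = p0 + p1*x + p2*y that
    reproduce the disparities at the three corner points of a patch."""
    (x1, y1), (x2, y2), (x3, y3) = corners
    d1, d2, d3 = disparities
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if det == 0:
        raise ValueError("corner points are collinear")
    p1 = ((d2 - d1) * (y3 - y1) - (d3 - d1) * (y2 - y1)) / det
    p2 = ((x2 - x1) * (d3 - d1) - (x3 - x1) * (d2 - d1)) / det
    p0 = d1 - p1 * x1 - p2 * y1
    return p0, p1, p2


def interpolate_disparity(params, x, y):
    """Evaluate the affine disparity model inside the patch."""
    p0, p1, p2 = params
    return p0 + p1 * x + p2 * y
```

For an exactly planar surface, the interpolated value matches `plane_disparity` at every point of the patch, so estimating nodal disparities loses nothing relative to estimating the plane parameters directly.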
Disparity Estimation (3)
Block-Based Approach
Within each block, the disparity function is described by a constant or a low-order polynomial.
Its parameters are determined by minimizing the error between the two views after warping one of them with the estimated disparity function.
The minimization is solved by exhaustive or gradient-descent search, subject to the constraints on the disparity distribution listed under Disparity Estimation (1).
The search range should be much larger than is typical in motion estimation.
The constant-disparity model is only appropriate for a flat surface that is parallel to the image plane, so it works well only when the block size is small.
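A minimal sketch of the constant-disparity case with exhaustive search, assuming a parallel set-up (so only non-negative horizontal disparities are tried) and the sum of absolute differences as the error measure. The function name, the list-of-rows image layout, and the error measure are illustrative choices, not taken from the slides.

```python
def match_block(left, right, top, col, size, max_disp):
    """Exhaustive search for the constant disparity of one block.

    The block of the right image with top-left corner (top, col) is compared
    with left-image blocks shifted by d = 0..max_disp pixels to the right,
    since x_l = x_r + d_x with d_x >= 0 for parallel cameras.
    Images are lists of rows of numeric intensities.
    """
    width = len(left[0])
    best_d, best_err = 0, float("inf")
    for d in range(max_disp + 1):
        if col + d + size > width:
            break  # shifted block would leave the image
        err = sum(abs(right[r][c] - left[r][c + d])
                  for r in range(top, top + size)
                  for c in range(col, col + size))
        if err < best_err:
            best_d, best_err = d, err
    return best_d
```

Restricting the search to one direction along the scan line applies the epipolar and unidirectionality constraints directly, which offsets some of the cost of the larger search range.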
Disparity Estimation (4)
Two-dimensional mesh-based approach
[Figure: a 3-D point X imaged at x_l and x_r, with corresponding mesh elements B_{m,l} and B_{m,r} in the left and right images]
The nodal displacements are found by minimizing the disparity-compensated prediction error between corresponding elements, summed over the four elements attached to each node.
With the parallel set-up, only horizontal disparities need to be searched.
[Figure panels: original left image; original right image; regular mesh on the left image; corresponding mesh on the right image; predicted right image by the block-matching algorithm (BMA) (32.03 dB); predicted right image by mesh (27.48 dB)]
Despite its lower PSNR, the mesh-based scheme yields a visually more accurate prediction.
Disparity Estimation (5)
Intra-Line Edge Matching Using Dynamic Programming
The stereo matching process can be considered as finding a path in a graph.
[Figure: path through a grid whose axes are the left scan line (number of edge points in the left image) and the right scan line (number of edge points in the right image)]
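The path-finding view can be sketched as a standard dynamic program over the two lists of edge points. This is a simplified illustration, not the algorithm from the slides: each edge point is assumed to carry a single numeric feature (for example an intensity), unmatched points pay a fixed `occlusion_cost`, and the function name is hypothetical.

```python
def match_scanline(left, right, occlusion_cost=1.0):
    """Match edge points of one scan line pair by dynamic programming.

    left, right: lists of numeric features, one per edge point, in scan order.
    Returns a list of (left_index, right_index) pairs. The path is monotone,
    so the result always satisfies the ordering constraint.
    """
    n, m = len(left), len(right)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * occlusion_cost
    for j in range(1, m + 1):
        cost[0][j] = j * occlusion_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]),  # match
                cost[i - 1][j] + occlusion_cost,  # left point unmatched
                cost[i][j - 1] + occlusion_cost,  # right point unmatched
            )
    # Backtrack from the end of both scan lines to recover the path
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + occlusion_cost:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs
```

The disparity at each matched edge is the difference between the image positions of the paired edge points; disparities between edges would then have to be interpolated.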
Disparity Estimation (6)
Joint Structure and Motion Estimation
The surface of the imaged object is modeled with a 3-D mesh.
The 3-D mesh projects to 2-D meshes in the left and right images.
Intermediate View Synthesis (1)
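Naïve approach: linear interpolation without considering disparity
$$\psi_c(\mathbf{x}) = w_l(\mathbf{x})\,\psi_l(\mathbf{x}) + w_r(\mathbf{x})\,\psi_r(\mathbf{x})$$
$$w_l(\mathbf{x}) = \frac{D_{cr}}{D_{cl} + D_{cr}}, \qquad w_r(\mathbf{x}) = 1 - w_l(\mathbf{x})$$
D_cl is the baseline distance from the central view to the left view, and D_cr the distance from the central view to the right view.
This approach yields blurred images.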
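Intermediate View Synthesis (2)
Disparity-compensated interpolation
$$\psi_c(\mathbf{x}) = w_l(\mathbf{x})\,\psi_l(\mathbf{x} + \mathbf{d}_{cl}(\mathbf{x})) + w_r(\mathbf{x})\,\psi_r(\mathbf{x} + \mathbf{d}_{cr}(\mathbf{x}))$$
$$w_l(\mathbf{x}) = \begin{cases} \dfrac{D_{cr}}{D_{cl} + D_{cr}}, & \text{if } \mathbf{x} \text{ is visible in both views,} \\ 1, & \text{if } \mathbf{x} \text{ is visible only in the left view,} \\ 0, & \text{if } \mathbf{x} \text{ is visible only in the right view,} \end{cases} \qquad w_r(\mathbf{x}) = 1 - w_l(\mathbf{x})$$
[Figure: pixel x in the central view corresponds to x + d_cl(x) in the left view and x + d_cr(x) in the right view]
This supposes that d_cl(x) and d_cr(x) are known.
In reality, only d_lr(x) can be estimated, and it is not easy to generate d_cl(x) and d_cr(x) from d_lr(x).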
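A one-dimensional sketch of disparity-compensated interpolation along a single scan line, assuming d_cl(x) and d_cr(x) are already known and integer-valued. The function name and the `"both"` / `"left"` / `"right"` visibility labels are assumptions made for the example.

```python
def synthesize_central(left, right, d_cl, d_cr, visibility, D_cl, D_cr):
    """Disparity-compensated interpolation of one central-view scan line.

    left, right: intensity lists of the two views.
    d_cl[x], d_cr[x]: integer disparities from central pixel x to each view.
    visibility[x]: "both", "left", or "right".
    D_cl, D_cr: baseline distances from the central view to each view.
    """
    out = []
    for x, vis in enumerate(visibility):
        if vis == "both":
            w_l = D_cr / (D_cl + D_cr)
        elif vis == "left":
            w_l = 1.0
        else:
            w_l = 0.0
        value = 0.0
        if w_l > 0:  # sample the left view only where it contributes
            value += w_l * left[x + d_cl[x]]
        if w_l < 1:  # sample the right view only where it contributes
            value += (1.0 - w_l) * right[x + d_cr[x]]
        out.append(value)
    return out
```

Unlike the naïve approach, both samples come from the same scene point, so averaging them does not blur edges; the weights only arbitrate between the two views and handle occlusions.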
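Intermediate View Synthesis (3)
The problem is solved if d_lr(x) is estimated by the mesh-based approach: each node x_{c,n} of the central view is obtained from the corresponding nodes x_{l,n} and x_{r,n} of the left and right views.
$$\mathbf{x}_{c,n} = \frac{D_{cr}}{D_{cl} + D_{cr}}\,\mathbf{x}_{l,n} + \frac{D_{cl}}{D_{cl} + D_{cr}}\,\mathbf{x}_{r,n}$$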
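The nodal interpolation is a per-node weighted average, sketched below. The function name and the representation of nodes as `(x, y)` tuples are assumptions for illustration.

```python
def central_nodes(left_nodes, right_nodes, D_cl, D_cr):
    """Node positions of the central-view mesh from corresponding
    left-view and right-view nodes, given as lists of (x, y) tuples."""
    w_l = D_cr / (D_cl + D_cr)  # weight of the left-view node
    w_r = D_cl / (D_cl + D_cr)  # weight of the right-view node
    return [(w_l * xl + w_r * xr, w_l * yl + w_r * yr)
            for (xl, yl), (xr, yr) in zip(left_nodes, right_nodes)]
```

A central view close to the left camera (small D_cl) gets nodes close to the left-view nodes, as expected. The texture inside each element can then be warped from the left and right views using the affine mapping defined by the element's nodes.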
Stereo Sequence Coding (1)
Multiview profile of MPEG-2
The left-view sequence S_l is coded first. For the right-view sequence, each frame is predicted from the corresponding frame in S_l based on an estimated disparity field, and the prediction error image is coded.
[Figure: frame types. Right view: P B B B. Left view: I B B P.]
Stereo Sequence Coding (2)
Incomplete 3-D representation of multiview sequences: an augmented texture map, a region segmentation, and disparity information for each region.
The texture maps of all the different regions are put together in one augmented image.
[Figure panels: original left; original right; augmented texture; disparity map]
Stereo Sequence Coding (3)
Mixed-resolution coding
Based on properties of the human visual system (HVS), the resolution of one of the two images can be considerably reduced when the image is presented for a short time.
One of the left and right sequences is coded at high resolution, while the other is first down-sampled spatially and temporally and then coded.
[Figure: left and right sequences, one at high resolution and one at low resolution]
Stereo Sequence Coding (4)
3-D object-based coding
[Block diagram. Input: left and right sequences. Blocks: object segmentation; motion and structure estimation; shape and motion parameter coding; reference texture image extraction; reference texture image coding; coded view synthesis; synthesis error image coding. Outputs: shape and motion bits; texture bits; synthesis error bits.]
3-D object-based coding
Instead of deriving 2-D motion and disparity for performing motion-compensated prediction (MCP) and disparity-compensated prediction (DCP), 3-D structure and motion parameters are estimated from the stereo or multiple views.
The structure, motion, and surface texture of each object are coded, instead of individual image frames.
At the decoder, the desired views are synthesized.
Advantages:
accurate 3-D estimation;
with the 3-D information derived from the stereo pair, any intermediate view can be generated;
the coded 3-D information enables manipulation of the imaged object and scene.
Wire-framed object, nodal positions, nodal displacement vectors, segmentation map, I3D
Stereo Sequence Coding (5)
3-D model-based coding
It is very difficult to derive the 3-D structure of the objects in a scene automatically.
Instead, a generic model is built for each potential object.
This is feasible when only a few objects are in the scene, e.g., in teleconferencing applications, where pre-designed generic face and body models can be used.