![Page 1: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/1.jpg)
Some slides are from S. Seitz, S. Narasimhan, K. Grauman
CS4442/9542b Artificial Intelligence II
prof. Olga Veksler
Lecture 11 Computer Vision
Stereo
![Page 2: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/2.jpg)
Outline • Cues for 3D reconstruction • Stereo Cues • Stereo Reconstruction
1) camera calibration and rectification • an easier, mostly solved problem
2) stereo correspondence • a harder problem
![Page 3: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/3.jpg)
2D Images • Depth is inherently ambiguous from a single view
P
X ?
Y ?
Z ?
![Page 4: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/4.jpg)
2D Images • World is 3D • In 2D images, depth (the third coordinate) is largely lost
• includes human retina
![Page 5: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/5.jpg)
Street Pavement Art • Viewed from the “right” side
![Page 6: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/6.jpg)
Street Pavement Art • Viewed from the “wrong” side
![Page 7: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/7.jpg)
Babies and Animals Perceive Depth
The Visual Cliff, by William Vandivert, 1960
• Yet we perceive the world in 3D
![Page 8: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/8.jpg)
3D Shape from Images • What image cues provide 3D information? • Cues from a single image • Cues from multiple images
• Motion cues • Stereo cues
• Can we use these cues in a computer vision system?
![Page 9: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/9.jpg)
Single Image 3D Cues: Shading
Merle Norman Cosmetics, Los Angeles
• Pixels covered by shadow are perceived to be further away
![Page 10: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/10.jpg)
Single Image 3D Cues: Linear Perspective • The farther away parallel lines are, the closer together they appear
![Page 11: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/11.jpg)
Single Image 3D Cues: Relative Size • If objects have the same size, those further away appear smaller
![Page 12: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/12.jpg)
Single Image 3D Cues: Texture • Further away texture appears finer (smaller scale)
![Page 13: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/13.jpg)
Single Image 3D Cues: Known Size • Ducks are smaller than elephants, so the duck must be closer
![Page 14: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/14.jpg)
Illusions: Linear Perspective + Relative Size
![Page 15: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/15.jpg)
Illusions: Linear Perspective + Relative Size
![Page 16: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/16.jpg)
Illusions: Ames Room
![Page 17: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/17.jpg)
Cues from Multiple Images: Motion Parallax
http://psych.hanover.edu/KRANTZ/MotionParallax/MotionParallax.html
• Closer objects appear to move more than further away objects
![Page 18: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/18.jpg)
3D Shape from X • X = shading, texture, motion, ... • We will focus on stereo
• depth perception from two stereo images
![Page 19: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/19.jpg)
Why Two Eyes? Cyclopes?
![Page 20: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/20.jpg)
Why Two Eyes? • Charles Wheatstone first explained stereopsis in 1838
[Figure: a point of the 3D scene projects to (x,y) in the left image (left eye) and to (x-d,y) in the right image (right eye)]
![Page 21: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/21.jpg)
Why Two Eyes? • Disparity d is the difference in x coordinates of corresponding points
[Figure: a point of the 3D scene projects to (x,y) in the left image (left eye) and to (x-d,y) in the right image (right eye)]
![Page 22: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/22.jpg)
Stereoscopes • Wheatstone invented the first stereoscope
![Page 23: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/23.jpg)
Anaglyph Images • Encode the left and right images into a single picture • left eye image is transferred to the red channel • right eye image to the green+blue = cyan channel
• Red filter lets through only the left eye image
• Cyan filter lets through only the right eye image
• Brain fuses them into 3D • Similar technology is used for 3D movies • Works for most of us
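The channel assignment described on this slide can be sketched in a few lines. This is only an illustration, assuming both images are NumPy arrays of shape (height, width, 3) in RGB channel order; the function name `make_anaglyph` is hypothetical.

```python
import numpy as np

def make_anaglyph(left_rgb, right_rgb):
    """Combine a stereo pair into one red-cyan anaglyph (assumes RGB order)."""
    if left_rgb.shape != right_rgb.shape:
        raise ValueError("left and right images must have the same shape")
    anaglyph = right_rgb.copy()          # green and blue come from the right eye image
    anaglyph[..., 0] = left_rgb[..., 0]  # red comes from the left eye image
    return anaglyph

# Tiny synthetic pair: constant images make the channel routing easy to see.
left = np.full((2, 3, 3), 10, dtype=np.uint8)
right = np.full((2, 3, 3), 200, dtype=np.uint8)
anaglyph = make_anaglyph(left, right)
```

With these inputs the red channel of `anaglyph` holds the left image values and the green and blue channels hold the right image values.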
![Page 24: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/24.jpg)
What is Needed for Stereopsis? • Does stereopsis need monocular cues? Does it need recognizable objects? • Answered by Julesz in 1960 • Start from an image with no monocular cues and no recognizable objects: random dots
![Page 25: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/25.jpg)
Need Object Recognition for Stereopsis? • Answered by Julesz in 1960 • Make a copy of it
![Page 26: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/26.jpg)
Need Object Recognition for Stereopsis? • Answered by Julesz in 1960 • Select a square
![Page 27: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/27.jpg)
Need Object Recognition for Stereopsis? • Answered by Julesz in 1960 • Copy the square into the right image, shifting it by d to the left
• random dot stereogram
![Page 28: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/28.jpg)
Need Object Recognition for Stereopsis? • Answered by Julesz in 1960 • Random dot stereogram • Humans perceive square floating in front of background
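The construction steps from the last few slides (random dots, copy, select a square, paste it shifted by d) can be sketched as follows. The image size, square size, shift, and the function name are illustrative assumptions, not values from the lecture.

```python
import numpy as np

def random_dot_stereogram(size=64, square=24, d=4, seed=0):
    """Return (left, right) binary random-dot images with a shifted central square."""
    rng = np.random.default_rng(seed)
    left = rng.integers(0, 2, size=(size, size), dtype=np.uint8)  # image of random dots
    right = left.copy()                                           # make a copy of it
    top = (size - square) // 2                                    # select a centered square
    col = (size - square) // 2
    patch = left[top:top + square, col:col + square]
    # Copy the square into the right image, shifted by d pixels to the left.
    right[top:top + square, col - d:col - d + square] = patch
    return left, right

left, right = random_dot_stereogram()
```

The strip uncovered by the shift keeps its original dots in this sketch; a common refinement is to refill that strip with fresh random dots.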
![Page 29: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/29.jpg)
3D Shape from Stereo • Use two cameras instead of two eyes
[Figure: a point of the 3D scene projects to (x,y) in the left image (left camera) and to (x-d,y) in the right image (right camera)]
![Page 30: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/30.jpg)
Stereo System
[Figure: a 3D scene point viewed from the optical centers of the left and right cameras]
• Unlike eyes, usually stereo cameras are not on the same plane • better numerical stability
![Page 31: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/31.jpg)
Stereo System: Triangulation
• Depth by triangulation • given two corresponding points in the left and right image • cast the rays through the optical camera centers • ray intersection is the corresponding 3D world point P • depth of P is based on camera positions and parameters
• Triangulation ideas can be traced to ancient Greece
[Figure: 3D scene point P and the optical centers of the left and right cameras; a triangulation illustration from a document from 1533]
![Page 32: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/32.jpg)
What is needed for Triangulation
1. Distance between cameras, camera focal length • Solved through camera calibration, essentially a solved problem • We will not talk about it • Code available on the web
• OpenCV http://www.intel.com/research/mrl/research/opencv/ • Matlab, J. Bouguet http://www.vision.caltech.edu/bouguetj/calib_doc/index.html • Zhengyou Zhang http://research.microsoft.com/~zhang/Calib/
2. Pairs of corresponding pixels in left and right images • Called stereo correspondence problem, still much researched
![Page 33: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/33.jpg)
Formula: Depth from Disparity
• Top-down view of the geometry (slice through the XZ plane) • From camera calibration, we know the distance between the camera optical centers, called the baseline B, and the camera focal length f
[Figure: left and right optical centers Cl and Cr separated by baseline B; scene point P = (X,Y,Z) at depth Z; left and right image points xl and xr at distance f from the optical centers]
![Page 34: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/34.jpg)
Formula: Depth from Disparity • Height to base ratio of triangle Cl P Cr:
$$\frac{Z}{B}$$
![Page 35: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/35.jpg)
Formula: Depth from Disparity
• Height to base ratio of triangle xl P xr:
$$\frac{Z - f}{B - x_l + x_r}$$
• xl is positive, xr is negative
![Page 36: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/36.jpg)
Formula: Depth from Disparity • ∆ Cl P Cr and ∆ xl P xr are similar:
$$\frac{Z}{B} = \frac{Z - f}{B - x_l + x_r}$$
![Page 37: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/37.jpg)
Formula: Depth from Disparity • Rewriting:
$$Z = \frac{B \cdot f}{x_l - x_r}$$
• xl - xr is the disparity
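As a quick numeric check of the depth formula, here is a minimal sketch. The function name and the example values (baseline in meters, focal length and image coordinates in pixels) are assumptions made for illustration only.

```python
def depth_from_disparity(x_left, x_right, baseline, focal_length):
    """Depth Z = B * f / (xl - xr) for a rectified stereo pair."""
    disparity = x_left - x_right
    if disparity <= 0:
        # Zero disparity corresponds to a point at infinity; negative is not valid here.
        raise ValueError("disparity must be positive")
    return baseline * focal_length / disparity

# Hypothetical setup: B = 0.1 m, f = 700 px, xl = 20 px, xr = -15 px (disparity 35 px).
z = depth_from_disparity(20, -15, baseline=0.1, focal_length=700)
```

Because Z is inversely proportional to the disparity, halving the disparity doubles the estimated depth, so small disparity errors matter most for distant points.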
![Page 38: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/38.jpg)
Stereo Correspondence: Epipolar Lines • Which pairs of pixels correspond to the same scene element?
• Epipolar constraint • Given a left image pixel, the corresponding pixel in the right image must lie on a line called the epipolar line • reduces correspondence to a 1D search along conjugate epipolar lines • demo: http://www.ai.sri.com/~luong/research/Meta3DViewer/EpipolarGeo.html
[Figure: epipolar geometry with the optical centers of the left and right cameras]
![Page 39: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/39.jpg)
Stereo Rectification • Epipolar lines can be computed from camera calibration
• Usually they are not horizontal • Can rectify stereo pair to make epipolar lines horizontal
![Page 40: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/40.jpg)
Stereo Correspondence
• From now on assume stereo pair is rectified • How to solve the correspondence problem? • Corresponding pixels should be similar in intensity
• or color, or something else
[Figure: pixel (x,y) in the left image corresponds to pixel (x-d,y) in the right image]
![Page 41: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/41.jpg)
Difficulties in Stereo Correspondence • Image noise
• corresponding pixels have similar, but not exactly the same intensities
• Matching each pixel individually is unreliable
[Figure: left image patch and right image patch with example pixel intensities 90, 98, 90]
![Page 42: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/42.jpg)
Difficulties in Stereo Correspondence • Especially in regions with (almost) constant intensity
? ? ?
• Matching each pixel individually is unreliable
![Page 43: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/43.jpg)
Window Matching Correspondence
• Use a window (patch) of pixels • more likely to have enough intensity variation to form a distinguishable pattern • also more robust to noise
![Page 44: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/44.jpg)
![Page 45: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/45.jpg)
Window Matching: Basic Algorithm
• for each epipolar line
  • for each pixel p on the left line
    • compare the window around p with a window of the same size at many candidate locations on the corresponding epipolar line in the right image
    • pick the location with the best matching window
![Page 46: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/46.jpg)
Which Locations to Try?
• Disparity cannot be negative • Maximum possible disparity is limited by the camera setup
• assume we know maxDisp • Disparity can range from 0 to maxDisp
• consider only (x,y), (x-1,y), …, (x-maxDisp,y) in the right image
[Figure: pixel (x,y) in the left image and candidate pixels (x,y), (x-1,y), …, (x-maxDisp,y) in the right image]
![Page 47: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/47.jpg)
Window Matching Cost
• How to define the best matching window? • Define window cost
• sum of squared differences (SSD) • or sum of absolute differences (SAD) • many other possibilities
• Pick window of best (smallest) cost
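The basic algorithm, the candidate range 0..maxDisp, and the window cost can be put together in a naive sketch like the one below. It assumes a rectified grayscale pair stored as 2D NumPy arrays; the function names, the `half` window radius, and the choice to skip border pixels are illustrative assumptions.

```python
import numpy as np

def window_cost(a, b, kind="ssd"):
    """Sum of squared (SSD) or absolute (SAD) differences between two windows."""
    diff = a.astype(np.int64) - b.astype(np.int64)
    if kind == "ssd":
        return int(np.sum(diff * diff))
    return int(np.sum(np.abs(diff)))

def match_windows(left, right, max_disp, half=1, kind="ssd"):
    """Naive window matching; returns a disparity map (border pixels stay 0)."""
    h, w = left.shape
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):              # each (horizontal) epipolar line
        for x in range(half, w - half):          # each pixel on the left line
            win_l = left[y - half:y + half + 1, x - half:x + half + 1]
            best_cost, best_d = None, 0
            for d in range(max_disp + 1):        # candidates (x,y), (x-1,y), ...
                xr = x - d
                if xr - half < 0:                # window would leave the image
                    break
                win_r = right[y - half:y + half + 1, xr - half:xr + half + 1]
                cost = window_cost(win_l, win_r, kind)
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d  # keep the smallest cost so far
            disp[y, x] = best_d
    return disp
```

A call such as `match_windows(left, right, max_disp=16, half=5)` would use an 11 by 11 window; the triple loop is slow and is meant only to mirror the steps, not to be efficient.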
![Page 48: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/48.jpg)
SSD Window Cost
left image:
3 5 4 4 2 4 2
7 4 1 4 4 2 6
2 7 46 46 46 6 7
5 9 46 46 44 9 7
4 7 47 47 47 2 4
4 7 56 56 46 6 7
3 4 4 1 4 3 2
right image:
3 5 4 4 2 4 2
7 4 1 4 4 2 6
46 46 46 3 6 6 7
48 46 44 6 4 9 7
47 47 47 7 4 2 4
58 56 46 5 6 6 7
3 4 4 1 4 3 2
SSD cost of the 3 by 3 window:
$$
\begin{aligned}
&(46-44)^2 + (46-6)^2 + (44-4)^2 \\
+\,&(47-47)^2 + (47-7)^2 + (47-4)^2 \\
+\,&(56-46)^2 + (56-5)^2 + (46-6)^2 = 10954
\end{aligned}
$$
![Page 49: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/49.jpg)
Algorithm with SSD Window Cost
• Same left and right images as on the previous slide
• This shift corresponds to disparity 0:
$$
\begin{aligned}
&(46-44)^2 + (46-6)^2 + (44-4)^2 \\
+\,&(47-47)^2 + (47-7)^2 + (47-4)^2 \\
+\,&(56-46)^2 + (56-5)^2 + (46-6)^2 = 10954
\end{aligned}
$$
![Page 50: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/50.jpg)
Algorithm with SSD Window Cost
• Same left and right images as before
• This shift corresponds to disparity 1:
$$
\begin{aligned}
&(46-46)^2 + (46-44)^2 + (44-6)^2 \\
+\,&(47-47)^2 + (47-47)^2 + (47-7)^2 \\
+\,&(56-56)^2 + (56-46)^2 + (46-5)^2 = 4829
\end{aligned}
$$
![Page 51: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/51.jpg)
Algorithm with SSD Window Cost
• Same left and right images as before
• This shift corresponds to disparity 2:
$$
\begin{aligned}
&(46-48)^2 + (46-46)^2 + (44-44)^2 \\
+\,&(47-47)^2 + (47-47)^2 + (47-47)^2 \\
+\,&(56-58)^2 + (56-56)^2 + (46-46)^2 = 8
\end{aligned}
$$
![Page 52: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/52.jpg)
Algorithm with SSD Window Cost
• Same left and right images as before
• SSD window costs at disparities 0, 1, and 2 are 10954, 4829, and 8
• Best SSD window cost is 8 at disparity 2 • The highlighted (red) pixel is assigned disparity 2 • Repeat this for all image pixels
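The worked example can be recomputed directly from the two 7 by 7 images. The sketch below assumes the 3 by 3 window covers rows 3 to 5 and columns 2 to 4 of the left image (0-indexed), which is inferred from the terms of the sums rather than stated on the slides; the function name `ssd_costs` is hypothetical.

```python
import numpy as np

left = np.array([
    [3, 5, 4, 4, 2, 4, 2],
    [7, 4, 1, 4, 4, 2, 6],
    [2, 7, 46, 46, 46, 6, 7],
    [5, 9, 46, 46, 44, 9, 7],
    [4, 7, 47, 47, 47, 2, 4],
    [4, 7, 56, 56, 46, 6, 7],
    [3, 4, 4, 1, 4, 3, 2],
])
right = np.array([
    [3, 5, 4, 4, 2, 4, 2],
    [7, 4, 1, 4, 4, 2, 6],
    [46, 46, 46, 3, 6, 6, 7],
    [48, 46, 44, 6, 4, 9, 7],
    [47, 47, 47, 7, 4, 2, 4],
    [58, 56, 46, 5, 6, 6, 7],
    [3, 4, 4, 1, 4, 3, 2],
])

def ssd_costs(left, right, row, col, half, max_disp):
    """SSD cost of the window centered at (row, col) for disparities 0..max_disp."""
    win_l = left[row - half:row + half + 1, col - half:col + half + 1].astype(np.int64)
    costs = []
    for d in range(max_disp + 1):
        c = col - d  # candidate window center in the right image
        win_r = right[row - half:row + half + 1, c - half:c + half + 1].astype(np.int64)
        costs.append(int(np.sum((win_l - win_r) ** 2)))
    return costs

costs = ssd_costs(left, right, row=4, col=3, half=1, max_disp=2)
best_disparity = int(np.argmin(costs))  # index of the smallest cost
print(costs, best_disparity)
```

The smallest cost occurs at disparity 2, where the two windows differ in only two pixels.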
![Page 53: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/53.jpg)
Correspondence with SSD Matching
[Plot: SSD cost as a function of disparity]
• Unique best cost location
![Page 54: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/54.jpg)
Compare to One Pixel “Window”
[Plot: SSD cost as a function of disparity]
• No unique best cost location
![Page 55: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/55.jpg)
SAD Window Cost
• SSD is fragile to outliers
window:
1 1 10
1 1 10
1 1 19
first candidate (one outlier pixel):
1 1 10
1 1 10
1 1 99
second candidate:
31 31 31
31 31 31
31 31 29
• first candidate: SSD cost = 80² = 6400
• second candidate: SSD cost = 6382 (best)
• SAD (Sum of Absolute Differences) is more robust
• first candidate: SAD cost = 80 (best)
• second candidate: SAD cost = 232
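The comparison on this slide is easy to reproduce. The sketch below recomputes both costs for the three 3 by 3 windows; the variable and function names are illustrative only.

```python
import numpy as np

window = np.array([[1, 1, 10], [1, 1, 10], [1, 1, 19]])
outlier_candidate = np.array([[1, 1, 10], [1, 1, 10], [1, 1, 99]])      # one outlier pixel
other_candidate = np.array([[31, 31, 31], [31, 31, 31], [31, 31, 29]])  # different pattern

def ssd(a, b):
    """Sum of squared differences."""
    diff = a.astype(np.int64) - b.astype(np.int64)
    return int(np.sum(diff * diff))

def sad(a, b):
    """Sum of absolute differences."""
    diff = a.astype(np.int64) - b.astype(np.int64)
    return int(np.sum(np.abs(diff)))

for name, cand in [("outlier", outlier_candidate), ("other", other_candidate)]:
    print(name, "SSD:", ssd(window, cand), "SAD:", sad(window, cand))
```

Squaring lets the single outlier dominate the SSD, so SSD prefers the second candidate, while SAD still prefers the candidate that matches in eight of nine pixels.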
![Page 56: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/56.jpg)
Window Matching Efficiency • Suppose
• image has n pixels • matching window is 11 by 11
• Need 11⋅11 = 121 additions and multiplications to compute one window cost
• Multiply that by the number of locations to check (maxDisp+1)
• Multiply that by n image pixels • 121 ⋅ n ⋅ (maxDisp+1) • Tooooo sloooow
• gets worse for larger windows • Can get the cost down to n ⋅ (maxDisp+1) with integral images
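To get a feel for the numbers, the two operation counts can be evaluated for a concrete case. The image size (640 by 480) and maxDisp = 63 below are hypothetical values chosen only for illustration, and the counts ignore constant factors.

```python
def naive_ops(n_pixels, max_disp, window_side=11):
    """Operation count of naive window matching: window area * n * (maxDisp + 1)."""
    return window_side * window_side * n_pixels * (max_disp + 1)

def integral_ops(n_pixels, max_disp):
    """Operation count with integral images, up to a constant: n * (maxDisp + 1)."""
    return n_pixels * (max_disp + 1)

n = 640 * 480  # assumed image size
slow = naive_ops(n, max_disp=63)
fast = integral_ops(n, max_disp=63)
print(slow, fast, slow // fast)
```

The ratio between the two counts equals the window area, so the saving grows with the window size.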
![Page 57: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/57.jpg)
Speedups: Integral Image • Given image f(x,y), the integral image I(x,y) is the sum of values in f(x,y) to the left and above (x,y), including (x,y)
f(x,y):
0 0 0 5 5
0 0 5 5 5
0 5 5 5 10
5 5 5 10 0
5 5 10 0 0
I(x,y):
0 0 0 5 10
0 0 5 15 25
0 5 15 30 50
5 15 30 55 75
10 25 50 75 95
• Example: I(2,2) = 0 + 0 + 0 + 0 + 0 + 5 + 0 + 5 + 5 = 15 • indexing starts at 0 in this example
![Page 58: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/58.jpg)
Speedups: Integral Image • Given image f(x,y), the integral image I(x,y) is the sum of values in f(x,y) to the left and above (x,y), including (x,y)
• Example for the same f(x,y) and I(x,y) as on the previous slide, with x the column index and y the row index: I(4,1) = 0 + 0 + 0 + 5 + 5 + 0 + 0 + 5 + 5 + 5 = 25
![Page 59: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/59.jpg)
Efficiently Computing Integral Image
• Suppose the integral image has been computed up to location (x,y)
• Start with the pixel value itself: I(x,y) = f(x,y) + …
![Page 60: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/60.jpg)
Efficiently Computing Integral Image
• Suppose the integral image has been computed up to location (x,y)
• Add the sum over the region to the left: I(x,y) = f(x,y) + I(x-1,y) + …
![Page 61: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/61.jpg)
Efficiently Computing Integral Image
• Suppose the integral image has been computed up to location (x,y)
• Add the sum over the region above: I(x,y) = f(x,y) + I(x-1,y) + I(x,y-1) - …
![Page 62: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/62.jpg)
Efficiently Computing Integral Image
• Suppose the integral image has been computed up to location (x,y)
• Subtract the overlap of the two regions, which was added twice:
I(x,y) = f(x,y) + I(x-1,y) + I(x,y-1) - I(x-1,y-1)
![Page 63: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/63.jpg)
Integral Image: Order of Computation
• Convenient order of computation 1. first row 2. first column 3. the rest in row-wise fashion
Order in which the entries of I(x,y) are computed:
1 2 3 4 5
6 10 11 12 13
7 14 15 16 17
8 18 19 20 21
9 22 23 24 25
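The recurrence I(x,y) = f(x,y) + I(x-1,y) + I(x,y-1) - I(x-1,y-1) can be sketched as below. Instead of handling the first row and first column separately, this version treats out-of-range entries as 0, which gives the same result; arrays are indexed as [y, x] (row, column), and the function name is an illustrative choice.

```python
import numpy as np

def integral_image(f):
    """Integral image via the four-term recurrence, in row-wise order."""
    h, w = f.shape
    I = np.zeros((h, w), dtype=np.int64)
    for y in range(h):
        for x in range(w):
            left = I[y, x - 1] if x > 0 else 0                 # I(x-1, y)
            up = I[y - 1, x] if y > 0 else 0                   # I(x, y-1)
            diag = I[y - 1, x - 1] if x > 0 and y > 0 else 0   # I(x-1, y-1)
            I[y, x] = f[y, x] + left + up - diag
    return I

f = np.array([
    [0, 0, 0, 5, 5],
    [0, 0, 5, 5, 5],
    [0, 5, 5, 5, 10],
    [5, 5, 5, 10, 0],
    [5, 5, 10, 0, 0],
])
I = integral_image(f)
```

In NumPy the same table can be obtained with `f.cumsum(axis=0).cumsum(axis=1)`; the explicit loop is shown only to mirror the recurrence.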
![Page 64: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/64.jpg)
Using Integral Image
• After the integral image is computed, the sum over any rectangular window is computed with four operations
• Top left corner (x1,y1) and bottom right corner (x2,y2)
I(x2,y2)
0 0 0 5 5
0 0 5 5 5
0 5 5 5 10
5 5 5 10 0
5 5 10 0 0
[Figure: f(x,y) (values above) and I(x,y); + marks the pixels counted so far]
![Page 65: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/65.jpg)
Using Integral Image
• After the integral image is computed, the sum over any rectangular window is computed with four operations
• Top left corner (x1,y1) and bottom right corner (x2,y2)
I(x2,y2) - I(x1-1,y2)
0 0 0 5 5
0 0 5 5 5
0 5 5 5 10
5 5 5 10 0
5 5 10 0 0
[Figure: f(x,y) (values above) and I(x,y); + marks pixels added, - marks pixels subtracted]
![Page 66: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/66.jpg)
Using Integral Image
• After the integral image is computed, the sum over any rectangular window is computed with four operations
• Top left corner (x1,y1) and bottom right corner (x2,y2)
I(x2,y2) - I(x1-1,y2) - I(x2,y1-1)
0 0 0 5 5
0 0 5 5 5
0 5 5 5 10
5 5 5 10 0
5 5 10 0 0
[Figure: f(x,y) (values above) and I(x,y); + marks pixels added, - marks pixels subtracted]
![Page 67: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/67.jpg)
Using Integral Image
• After the integral image is computed, the sum over any rectangular window is computed with four operations
• Top left corner (x1,y1) and bottom right corner (x2,y2)
I(x2,y2) - I(x1-1,y2) - I(x2,y1-1) + I(x1-1,y1-1)
0 0 0 5 5
0 0 5 5 5
0 5 5 5 10
5 5 5 10 0
5 5 10 0 0
[Figure: f(x,y) (values above) and I(x,y); + marks pixels added, - marks pixels subtracted; the top left region is subtracted twice and added back once]
![Page 68: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/68.jpg)
Using Integral Image
• After the integral image is computed, the sum over any rectangular window is computed with four operations
• Top left corner (x1,y1) and bottom right corner (x2,y2)
I(x2,y2) - I(x1-1,y2) - I(x2,y1-1) + I(x1-1,y1-1)
I(x,y):
0 0 0 5 10
0 0 5 15 25
0 5 15 30 50
5 15 30 55 75
10 25 50 75 95
f(x,y):
0 0 0 5 5
0 0 5 5 5
0 5 5 5 10
5 5 5 10 0
5 5 10 0 0
• Example: the window sum 5 + 5 + 10 + 5 + 10 + 0 = 35 is obtained from the integral image as 75 - 15 - 25 + 0 = 35
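The four-term window sum can be checked against the slide's example with a short Python sketch. The function name `window_sum`, the `[y][x]` indexing, and the treatment of index -1 as zero are assumptions made for illustration. The window coordinates (columns 2 to 4, rows 2 to 3, counting from 0) are inferred from the terms 75, 15, 25, 0 in the example and are not stated on the slide.

```python
def window_sum(I, x1, y1, x2, y2):
    """Sum of f over the rectangle with top left (x1, y1) and bottom right (x2, y2), inclusive."""
    def at(x, y):
        # Entries with a negative index lie outside the image and contribute 0.
        return I[y][x] if x >= 0 and y >= 0 else 0

    return at(x2, y2) - at(x1 - 1, y2) - at(x2, y1 - 1) + at(x1 - 1, y1 - 1)


# Integral image I(x,y) from the slide, stored row by row.
I = [
    [0, 0, 0, 5, 10],
    [0, 0, 5, 15, 25],
    [0, 5, 15, 30, 50],
    [5, 15, 30, 55, 75],
    [10, 25, 50, 75, 95],
]

total = window_sum(I, x1=2, y1=2, x2=4, y2=3)
print(total)  # 75 - 15 - 25 + 0 = 35
```

The cost of one query does not depend on the window size, which is what makes the integral image useful for window matching.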
![Page 69: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/69.jpg)
Inefficient Window Matching (SAD cost)
• for each pixel p
  • for every disparity d
    • compute cost between window around p in the left image and the same window shifted by d in the right image
  • pick d corresponding to the best matching window
left image:
3 5 4 4 2 4 2
7 4 1 4 4 2 6
2 7 46 46 46 6 7
5 9 46 46 44 9 7
4 7 47 47 47 2 4
4 7 56 56 46 6 7
3 4 4 1 4 3 2
right image:
3 5 4 4 2 4 2
7 4 1 4 4 2 6
46 46 46 3 6 6 7
48 46 44 6 4 9 7
47 47 47 7 4 2 4
58 56 46 5 6 6 7
3 4 4 1 4 3 2
[Figure: window costs shown on the slide: 256, 186, 4]
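The per-pixel search described on this slide can be sketched in Python as follows. This is a hedged illustration rather than the lecture's code: the names `sad_cost` and `best_disparity`, the 3x3 window, the chosen pixel, and the convention that column x in the left image is compared with column x - d in the right image are all assumptions. Bounds checking is omitted, so the window and its shifted copy must lie inside the images.

```python
LEFT = [
    [3, 5, 4, 4, 2, 4, 2],
    [7, 4, 1, 4, 4, 2, 6],
    [2, 7, 46, 46, 46, 6, 7],
    [5, 9, 46, 46, 44, 9, 7],
    [4, 7, 47, 47, 47, 2, 4],
    [4, 7, 56, 56, 46, 6, 7],
    [3, 4, 4, 1, 4, 3, 2],
]
RIGHT = [
    [3, 5, 4, 4, 2, 4, 2],
    [7, 4, 1, 4, 4, 2, 6],
    [46, 46, 46, 3, 6, 6, 7],
    [48, 46, 44, 6, 4, 9, 7],
    [47, 47, 47, 7, 4, 2, 4],
    [58, 56, 46, 5, 6, 6, 7],
    [3, 4, 4, 1, 4, 3, 2],
]


def sad_cost(left, right, px, py, d, half=1):
    """SAD between the window around (px, py) in left and the same window shifted by d in right."""
    cost = 0
    for y in range(py - half, py + half + 1):
        for x in range(px - half, px + half + 1):
            cost += abs(left[y][x] - right[y][x - d])
    return cost


def best_disparity(left, right, px, py, max_d, half=1):
    """Try every disparity 0..max_d for one pixel and return the cheapest one with all costs."""
    costs = [sad_cost(left, right, px, py, d, half) for d in range(max_d + 1)]
    return costs.index(min(costs)), costs


d, costs = best_disparity(LEFT, RIGHT, px=3, py=4, max_d=2)
print(d, costs)
```

Every call to `sad_cost` touches all pixels of the window, so the total work grows with the window area; this is the cost the integral image removes.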
![Page 70: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/70.jpg)
Integral Image for Window Matching
• For each disparity d, the window cost eventually has to be computed for all pixels
• For example, pick disparity d = 1
[Figure: left image and right image, same values as on the preceding slide]
![Page 71: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/71.jpg)
Integral Image for Window Matching
• Old inefficient algorithm:
  • for each pixel p
    • for every disparity d
      • compute cost between window around p in the left image and the same window shifted by d in the right image
    • pick d corresponding to the best matching window
• New efficient algorithm (swap the two loops):
  • for each disparity d
    • for every pixel p
      • compute cost between window around p in the left image and the same window shifted by d in the right image (use integral image)
  • pick d corresponding to the best matching window
![Page 72: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/72.jpg)
Integral Image for Window Matching
• Suppose current disparity is d = 1
• Overlay left and right image at disparity 1
• Compute AD (absolute difference) between every overlaid pair of pixels
• Compute SAD in a window for every pixel
[Figure: left image and right image, same values as on the preceding slides]
![Page 73: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/73.jpg)
Integral Image for Window Matching
• current disparity is d = 1
left image:
3 5 4 4 2 4 2
7 4 1 4 4 2 6
2 7 46 46 46 6 7
5 9 46 46 44 9 7
4 7 47 47 47 2 4
4 7 56 56 46 6 7
3 4 4 1 4 3 2
right image:
3 5 4 4 2 4 2
7 4 1 4 4 2 6
46 46 46 3 6 6 7
48 46 44 6 4 9 7
47 47 47 7 4 2 4
58 56 46 5 6 6 7
3 4 4 1 4 3 2
AD image for disparity 1:
2 1 0 2 2 2
3 3 3 0 2 4
39 0 0 43 0 1
39 0 2 38 5 2
40 0 0 40 2 2
51 0 10 41 0 1
1 0 3 3 1 1
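The AD image for disparity 1 can be reproduced from the two example images with a few lines of Python. The function name `ad_image` and the convention that column x of the left image is overlaid on column x - d of the right image are assumptions; that convention does reproduce the values on the slide.

```python
LEFT = [
    [3, 5, 4, 4, 2, 4, 2],
    [7, 4, 1, 4, 4, 2, 6],
    [2, 7, 46, 46, 46, 6, 7],
    [5, 9, 46, 46, 44, 9, 7],
    [4, 7, 47, 47, 47, 2, 4],
    [4, 7, 56, 56, 46, 6, 7],
    [3, 4, 4, 1, 4, 3, 2],
]
RIGHT = [
    [3, 5, 4, 4, 2, 4, 2],
    [7, 4, 1, 4, 4, 2, 6],
    [46, 46, 46, 3, 6, 6, 7],
    [48, 46, 44, 6, 4, 9, 7],
    [47, 47, 47, 7, 4, 2, 4],
    [58, 56, 46, 5, 6, 6, 7],
    [3, 4, 4, 1, 4, 3, 2],
]


def ad_image(left, right, d):
    """Absolute differences |left[y][x] - right[y][x - d]| over the columns where the images overlap."""
    return [
        [abs(lrow[x] - rrow[x - d]) for x in range(d, len(lrow))]
        for lrow, rrow in zip(left, right)
    ]


ad1 = ad_image(LEFT, RIGHT, 1)
for row in ad1:
    print(row)
```

The result has one column fewer than the input images, because the first column of the left image has no partner at disparity 1.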
![Page 74: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/74.jpg)
Integral Image for Window Matching
• current disparity is d = 1
• Pad AD image with zeros
[Figure: left image and right image, same values as on the preceding slides]
AD image for disparity 1, padded with a column of zeros:
0 2 1 0 2 2 2
0 3 3 3 0 2 4
0 39 0 0 43 0 1
0 39 0 2 38 5 2
0 40 0 0 40 2 2
0 51 0 10 41 0 1
0 1 0 3 3 1 1
![Page 75: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/75.jpg)
Integral Image for Window Matching
• current disparity is d = 1
[Figure: left image, right image, and zero-padded AD image for disparity 1; values unchanged from the preceding slides]
![Page 76: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/76.jpg)
Integral Image for Window Matching
• current disparity is d = 1
[Figure: left image, right image, and zero-padded AD image for disparity 1; values unchanged from the preceding slides]
![Page 77: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/77.jpg)
Integral Image for Window Matching
• current disparity is d = 1
[Figure: left image, right image, and zero-padded AD image for disparity 1; values unchanged from the preceding slides]
![Page 78: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/78.jpg)
Integral Image for Window Matching
• Current disparity is 1
• For each pixel, the window sum has to be computed in the AD image
• Apply integral image to AD image
AD image for disparity 1, padded with a column of zeros:
0 2 1 0 2 2 2
0 3 3 3 0 2 4
0 39 0 0 43 0 1
0 39 0 2 38 5 2
0 40 0 0 40 2 2
0 51 0 10 41 0 1
0 1 0 3 3 1 1
![Page 79: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/79.jpg)
Efficient Algorithm for Window Matching
for every pixel p do
    bestDisparity[p] = 0
    bestWindCost[p] = HUGE
for disparity d = 0, 1, …, maxD do
    overlay images at disparity d
    compute AD image for disparity d
    compute Integral image from AD image
    for every pixel p do
        currentCost = window cost at pixel p, computed from integral image
        if currentCost < bestWindCost[p]
            bestWindCost[p] = currentCost
            bestDisparity[p] = d
return bestDisparity
AD image for disparity 1:
2 1 0 2 2 2
3 3 3 0 2 4
39 0 0 43 0 1
39 0 2 38 5 2
40 0 0 40 2 2
51 0 10 41 0 1
1 0 3 3 1 1
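A runnable Python version of the efficient algorithm might look like the sketch below. It is one possible reading of the pseudocode, under stated assumptions: the names (`window_matching`, `integral_image`, `window_sum`) are made up, column x of the left image is compared with column x - d of the right image, the integral image carries one extra row and column of zeros instead of padding the AD image, and pixels whose window does not fit inside the overlap are simply skipped for that disparity.

```python
LEFT = [
    [3, 5, 4, 4, 2, 4, 2],
    [7, 4, 1, 4, 4, 2, 6],
    [2, 7, 46, 46, 46, 6, 7],
    [5, 9, 46, 46, 44, 9, 7],
    [4, 7, 47, 47, 47, 2, 4],
    [4, 7, 56, 56, 46, 6, 7],
    [3, 4, 4, 1, 4, 3, 2],
]
RIGHT = [
    [3, 5, 4, 4, 2, 4, 2],
    [7, 4, 1, 4, 4, 2, 6],
    [46, 46, 46, 3, 6, 6, 7],
    [48, 46, 44, 6, 4, 9, 7],
    [47, 47, 47, 7, 4, 2, 4],
    [58, 56, 46, 5, 6, 6, 7],
    [3, 4, 4, 1, 4, 3, 2],
]


def integral_image(a):
    """Integral image with an extra leading row and column of zeros (avoids index -1 lookups)."""
    h, w = len(a), len(a[0])
    I = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            I[y + 1][x + 1] = a[y][x] + I[y + 1][x] + I[y][x + 1] - I[y][x]
    return I


def window_sum(I, x1, y1, x2, y2):
    """Sum over the inclusive rectangle (x1, y1)-(x2, y2) using four lookups."""
    return I[y2 + 1][x2 + 1] - I[y2 + 1][x1] - I[y1][x2 + 1] + I[y1][x1]


def window_matching(left, right, max_d, half=1):
    h, w = len(left), len(left[0])
    best_disparity = [[0] * w for _ in range(h)]
    best_cost = [[float("inf")] * w for _ in range(h)]
    for d in range(max_d + 1):
        # AD image for disparity d; columns x < d have no partner and are set to zero.
        ad = [[abs(left[y][x] - right[y][x - d]) if x >= d else 0 for x in range(w)]
              for y in range(h)]
        I = integral_image(ad)
        # Score only pixels whose whole window lies inside the overlapping region.
        for y in range(half, h - half):
            for x in range(d + half, w - half):
                cost = window_sum(I, x - half, y - half, x + half, y + half)
                if cost < best_cost[y][x]:
                    best_cost[y][x] = cost
                    best_disparity[y][x] = d
    return best_disparity, best_cost


disp, cost = window_matching(LEFT, RIGHT, max_d=2)
print(disp[4][3], cost[4][3])
```

With the loops in this order, one AD image and one integral image are built per disparity, and each window cost is then read off in constant time regardless of window size.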
![Page 80: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/80.jpg)
Effect of Window size
[Figure: left image, right image, and true disparities (bright means larger disparity); results for 3x3, 7x7, and 15x15 windows]
![Page 81: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/81.jpg)
Effect of Window size: Low Texture Area
[Figure: left image; plot of window cost versus disparity for 3x3, 7x7, and 15x15 windows]
• windows of size 3x3 and 7x7 are too small to have a distinct pattern
  • no clearly best disparity
• window of size 15x15 is large enough to have a distinct pattern
  • 7 is clearly the best disparity
• window has to be large enough
![Page 82: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/82.jpg)
Effect of Window size: Near Discontinuities
[Figure: left image; plot of window cost versus disparity for 3x3, 7x7, and 15x15 windows]
• central pixel (the one we are matching) is the lamp
• windows of size 3x3 and 7x7 contain mostly the lamp
• window of size 15x15 contains mostly the wall
  • we match the wall instead of the lamp!
• window must be small enough to contain mostly the same object as the central pixel
![Page 83: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/83.jpg)
Effect of Window size
• No single window size is ‘perfect’ for the image
• Smaller window
  • works better around object boundaries
  • noisy results in low texture areas
• Larger window
  • better results in low texture areas
  • does not preserve object boundaries well
• Adaptive window algorithms exist [Veksler’2001]
![Page 84: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/84.jpg)
Better Stereo Algorithms
State of the art method [Boykov, Veksler, Zabih, 2001]
[Figure: disparity maps; one panel is labeled ground truth]
• Formulate stereo as energy minimization
• Recall binary object/background segmentation problem
[Figure: segmentation with regions labeled object and background]
![Page 85: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/85.jpg)
Better Stereo Algorithms
• Stereo is a multi-label segmentation problem
  • region 0 = label 0 “likes” disparity 0
  • region 1 = label 1 “likes” disparity 1
  • …
  • region maxDisp = label maxDisp “likes” disparity maxDisp
[Figure: ground truth with regions labeled disp 0, disp 1, disp 2, disp 4]
![Page 86: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/86.jpg)
Stereo with Graph Cuts
• Energy Function
  • Data Term: assign each pixel disparity label it likes
  • Smoothness Term: count number of label (disparity) discontinuities
[Figure: AD images for disparities 5, 8, 10, and 14, used as data terms for labels 5, 8, 10, and 14]
• Solved with Graph Cuts: iteratively cuts out regions corresponding to disparities
• NP-hard with more than 2 labels, but computes a good approximation
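To make the two terms concrete, the sketch below evaluates such an energy for a given labeling in Python. It only evaluates the energy; it does not implement graph cuts. The function name `energy`, the weight `lam` on the smoothness term, the 4-neighbour definition of a discontinuity, and the toy data costs are all assumptions for illustration, with `data_cost[d][y][x]` playing the role of the AD image for label d.

```python
def energy(labels, data_cost, lam=1):
    """Data term plus lam times the number of label discontinuities between 4-neighbours."""
    h, w = len(labels), len(labels[0])
    # Data term: cost of the label chosen at each pixel.
    data = sum(data_cost[labels[y][x]][y][x] for y in range(h) for x in range(w))
    # Smoothness term: count neighbouring pixel pairs with different labels.
    smooth = 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w and labels[y][x] != labels[y][x + 1]:
                smooth += 1
            if y + 1 < h and labels[y][x] != labels[y + 1][x]:
                smooth += 1
    return data + lam * smooth


# Toy 2x3 example with two labels (hypothetical costs):
# the two left columns "like" label 0, the right column "likes" label 1.
data_cost = [
    [[0, 0, 9], [0, 0, 9]],  # cost of label 0 at each pixel
    [[9, 9, 0], [9, 9, 0]],  # cost of label 1 at each pixel
]
labels_a = [[0, 0, 1], [0, 0, 1]]  # follows the data, two discontinuities
labels_b = [[0, 0, 0], [0, 0, 0]]  # perfectly smooth, ignores the data

print(energy(labels_a, data_cost), energy(labels_b, data_cost))
```

A larger `lam` favours smooth labelings such as `labels_b`; a smaller one favours labelings that follow the data, such as `labels_a`.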
![Page 87: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/87.jpg)
Stereo with Graph Cuts
• Start with everything as label (disparity) 0
![Page 88: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/88.jpg)
Stereo with Graph Cuts
• “Cut out” label (disparity) 1
![Page 89: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/89.jpg)
Stereo with Graph Cuts
• “Cut out” label (disparity) 2
![Page 90: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/90.jpg)
Stereo with Graph Cuts
• “Cut out” label (disparity) 3
![Page 91: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/91.jpg)
Stereo with Graph Cuts
• “Cut out” label (disparity) 4
![Page 92: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/92.jpg)
Stereo with Graph Cuts
• “Cut out” label (disparity) 5
![Page 93: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/93.jpg)
Stereo with Graph Cuts
• “Cut out” label (disparity) 6
![Page 94: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/94.jpg)
Multiple Artificial Eyes
• Two eyes better than one → three eyes better than two → four eyes better than three → … → the more, the better
![Page 95: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/95.jpg)
Common Folk Knew That Already
![Page 96: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/96.jpg)
Stereo with Structured Light
• Project “structured” light patterns onto the object
• Simplifies correspondence problem
• Need one camera and one projector
[Figure: setup with camera and projector]
![Page 97: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/97.jpg)
Stereo with Structured Light
• Triangulate between camera and projector
![Page 98: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/98.jpg)
Kinect: Structured Infrared Light
http://bbzippo.wordpress.com/2010/11/28/kinect-in-infrared/
![Page 99: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/99.jpg)
Laser Scanning
• Optical triangulation
  • Project a single stripe of laser light
  • Scan it across the surface of the object
  • This is a very precise version of structured light scanning
Digital Michelangelo Project, Levoy et al.
http://graphics.stanford.edu/projects/mich/
![Page 100: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/100.jpg)
Laser Scanned Models
The Digital Michelangelo Project, Levoy et al.
![Page 101: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/101.jpg)
Laser Scanned Models
The Digital Michelangelo Project, Levoy et al.
![Page 102: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/102.jpg)
Numerous Applications
• Autonomous navigation
Nomad robot searches for meteorites in Antarctica
http://www.frc.ri.cmu.edu/projects/meteorobot/index.html
![Page 103: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/103.jpg)
Novel View Synthesis
[Figure: input image (1 of 2), depth map, and 3D rendering] [Szeliski & Kang ‘95]
![Page 104: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/104.jpg)
Applications: Video View Interpolation
http://research.microsoft.com/users/larryz/videoviewinterpolation.htm
![Page 105: olga/Courses/Winter2017/CS4442_9542b/L11-CV-stereo.pdfAnaglyph Images •Encodes left and right image into a single picture •left eye image is transferred to the red channel •right](https://reader033.vdocument.in/reader033/viewer/2022041609/5e36a332d43eae46c1405aeb/html5/thumbnails/105.jpg)
Stereo Correspondence
• Steps:
  • Calibrate cameras
  • Rectify images
  • Stereo correspondence
  • Apply depth/disparity formula
• Stereo correspondence is still heavily researched
• The simple window matching algorithm we studied is heavily used in practice due to speed and simplicity
• Popular Benchmark
  http://www.middlebury.edu/stereo
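The last step, applying the depth/disparity formula, can be sketched as below. The formula is not restated in this part of the slides, so the sketch assumes the standard relation for rectified cameras, depth = focal length × baseline / disparity. The function name, parameter names, and numeric values are hypothetical.

```python
def depth_from_disparity(disparity, focal_px, baseline):
    """Depth for a rectified stereo pair, assuming depth = focal_px * baseline / disparity.

    focal_px is the focal length in pixels, baseline the distance between the
    camera centres; the result is in the same unit as the baseline.
    """
    if disparity <= 0:
        # Zero disparity corresponds to a point infinitely far away.
        return float("inf")
    return focal_px * baseline / disparity


# Hypothetical numbers: 400 px focal length, 0.5 m baseline.
for d in (1, 2, 4):
    print(d, depth_from_disparity(d, focal_px=400, baseline=0.5))
```

Depth is inversely proportional to disparity under this assumption, so doubling the disparity halves the estimated depth.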