computer vision 4
TRANSCRIPT
-
8/13/2019 Computer Vision 4
1/32
Copyright 2001. Howie Choset,Renata Melamud, Al Costa,
Vincent Lee-Shue, Sean Piper,
Ryan de Jonckheere. All RightsReserved
Computer Vision
-
8/13/2019 Computer Vision 4
2/32
2008 Howie Choset
Projection
P : R 3 -> R 2 or P : R 3 -> Z 2
Recover third dimension or justinfer stuff
-
8/13/2019 Computer Vision 4
3/32
2008 Howie Choset
ImagesDiscrete representation of a continuous function
Pixel: Picture Element cell of constant color in a digital image An image is a two dimensional array of pixelsPixel: numeric value representing a uniform portion of an image
Grayscale All pixels represent the intensity of light in an image, be it red,green, blue, or another color
Like holding a piece of transparent colored plastic over your eyesIntensity of light in a pixel is stored as a number, generally 0..255
inclusiveColorBinary
Three grayscale images layered on top of eachother with each layerindicating the intensity of a specific color light, generally red, green,and blue (RGB)Third dimension in a digital image
-
8/13/2019 Computer Vision 4
4/32
2008 Howie Choset
ImagesBinary images: Two color image
Pixel is only one byte of informationIndicates if the intensity of color is above or below some
nominal valueThresholding
ResolutionNumber of pixels across in horizontalNumber of pixels in the verticalNumber of layers used for color
Often measured in bits per pixel (bpp) where each color uses 8bits of data
Ex: 640x480x24bpp
-
8/13/2019 Computer Vision 4
5/32
2008 Howie Choset
OpticsFocal length
Length f of projection through lens onimage plane
InversionProjection on image plane is inverted
f
Image planeObject
-
8/13/2019 Computer Vision 4
6/32
2008 Howie Choset
Projection on the image planeSize of an image on the image plane is inversely proportional tothe distance from the focal point
h h
f
d
f h
d h
Focal point
h
h fd
f h
d h
Focal point
By conceptually moving the image plane, we can eliminate thenegative sign
Image plane
Image plane
-
8/13/2019 Computer Vision 4
7/32 2008 Howie Choset
Move to three dimensionsSimilarity holds when three dimensions areconsidered
x
y
z
Image plane
f
(x,y,z)
(x,y,z)
z z
y y
f x
x x
-
8/13/2019 Computer Vision 4
8/32 2008 Howie Choset
Perspective1 Point Perspective
Using similar triangles, it is possible to determine
the relative sizes of objects in an imageGiven a calibrated camera (predetermine amathematical relationship between size on theimage plane and the actual object)
f
Image planeObject
-
8/13/2019 Computer Vision 4
9/32 2008 Howie Choset
Preliminary Math:
TrigonometryPythagorean Theorem
a2 + b 2 = c 2
Law of sines
Law of cosinescba
sinsinsin
cos2222 abbac
-
8/13/2019 Computer Vision 4
10/32
-
8/13/2019 Computer Vision 4
11/32 2008 Howie Choset
Grayscale vs. Binary imageGrayscale Binary threshold
-
8/13/2019 Computer Vision 4
12/32 2008 Howie Choset
ThresholdingPurposeTrying to find areas of high color intensityHighlights locations of different features of the image (noticeMonas eyes) Image compression, use fewer bits to encode a pixel
How doneDecide on a value Scan every pixel in the image
If it is greater than , make it 255If it is less than , make it 0
Picking a good Often 128 is a good value to start withUse a histogram to determine values based on color frequencyfeatures
-
8/13/2019 Computer Vision 4
13/32 2008 Howie Choset
HistogramMeasure the number of pixels of differentvalues in an image.
Yields information such as the brightness ofan image, important color features,possibilities of color elimination forcompression
ThresholdingMake pixels above a value one color and valuesbelow that value a different colorBinary threshold often used to transform agrayscale image into black and white
Also usable for compression and feature extraction
-
8/13/2019 Computer Vision 4
14/32 2008 Howie Choset
Monas Histogram
0 255
-
8/13/2019 Computer Vision 4
15/32 2008 Howie Choset
ConnectivityTwo conventions on considering two pixelsnext to each other
To eliminate the ambiguity, we could define
the shape of a pixel to be a hexagon
8 point connectivity All pixels sharing a sideor corner are consideredadjacent
4 point connectivityOnly pixels sharing a sideare considered adjacent
-
8/13/2019 Computer Vision 4
16/32 2008 Howie Choset
Object location -
SegmentationOne method of locating an object is throughthe use of a wave front
Wavefront Assume a binary image with values of 0 or 11. Choose 1 st pixel with value 1, make it a 22. For each neighbor, if it is also a 1, make it a 2as well3. Repeat step two for each neighbor until thereare no neighbors with value 14. All pixels with a value 2 are are a continuousobject
-
8/13/2019 Computer Vision 4
17/32 2008 Howie Choset
CentroidsUse the region filled image from aboveCompute the area of the region
Number of pixels with the same number value (n)
Sum all of the x coords with the same pixel value. Do thesame for y coordsDivide each sum by n and the resulting x, y coord is thecentroid
-
8/13/2019 Computer Vision 4
18/32 2008 Howie Choset
Edge detectionScanline: one row of pixels in an imageTake the first derivative of a scanline
The derivative becomes nonzero when an
edge (pixels change values) is encountered
-
8/13/2019 Computer Vision 4
19/32 2008 Howie Choset
Implementing 1 st derivative
edge detection digitallyDerivative is defined asWith a scan line, the run (x c) is 1, and the
rise (f(x) f(c)) is B[m+1] B[m]This becomeswhere I is the resulting image of edges
This is really just a dot product of the vector[-1 1] repeated each pixel in the resultingimage
c xc f x f
c x )()(
lim
][1]1[1][ m Bm Bm I
11]1[][][ m Bm Bm I
-
8/13/2019 Computer Vision 4
20/32 2008 Howie Choset
More Math: ConvolutionThis operation of moving a mask acrossan image has a name, calledconvolutionIn order to mathematically apply a filterto a signal, we must use convolution
If you know Laplace transforms, this is amultiplication in the Laplace domain
-
8/13/2019 Computer Vision 4
21/32 2008 Howie Choset
Convolution: Analogd t h xt y )()()(
Given a symmetric h (common in image processing), simplifies to
d h xt y )()()(
h(t) = [-1 1]
Move across the signal x(possibly a scanline in animage)
-
8/13/2019 Computer Vision 4
22/32 2008 Howie Choset
Convolution: Digitalk
k hk n xn y ][][][
More useful in image processing on a digital computer
x[n] is a pixel in an image, y[n] is the resulting pixel0 2 2 0 1 1 3 0 1 1
40 2 2 0 1 1 3
0 1 1
0 2 2 0 1 1 3
0 1 14 2
1.
2.
-
8/13/2019 Computer Vision 4
23/32
-
8/13/2019 Computer Vision 4
24/32
2008 Howie Choset
Convolution: Two-dimensionalRotate your mask 180 degrees about the origin (if you were
doing correct convolution, but since we are doing the otherconvolution, you can skip this step.Do the same dot product operation, this time using matricesinstead of vectorsRepeat the dot product for every pixel in the resulting imageIn the boundary case around the edges of the image there aretwo options
extend the original image out using the pixel values at the edgeMake the resulting image y smaller than the original and dontcompute pixels where the mask would extend beyond the edge ofthe original
0 0
),(),(),( 0000m n
nmhnnmm xnm y
-
8/13/2019 Computer Vision 4
25/32
2008 Howie Choset
Convolution: Old and New Analog
Digital
Two dimensional, digital
0 0
),(),(),( 0000m n
nnmmhnm xnm y
d t h xt y )()()(
k
k nhk xn y ][][][
d ht xt y )()()(
k
k hk n xn y ][][][
0 0
),(),(),( 0000m n
nmhnnmm xnm y
-
8/13/2019 Computer Vision 4
26/32
2008 Howie Choset
Filters, Masks, TransformsEdge detection
Wide masks
SmoothingObject detection
-
8/13/2019 Computer Vision 4
27/32
-
8/13/2019 Computer Vision 4
28/32
2008 Howie Choset
Wide MasksWider masks lead to uncertainty about location of edgeCan detect more gradual edges
Convolution with Gaussian distribution
-
8/13/2019 Computer Vision 4
29/32
2008 Howie Choset
Wide MasksWider masks lead to uncertainty about location of edgeCan detect more gradual edges
Convolution with Gaussian distributionEdge detection on a signal thatwas convolved with a Gaussian
-
8/13/2019 Computer Vision 4
30/32
-
8/13/2019 Computer Vision 4
31/32
2008 Howie Choset
Cat Video
-
8/13/2019 Computer Vision 4
32/32
Stereo VisionWay of calculating depth from two dimensionalimages using two cameras
d and y are unknowns, and can be determinedprocessing and c is known
Camera 1
Camera 2
d
y
c
d
ycd
tan
tan
tan
tantantan
d
c y