Robot\Machine Vision

Cherevatsky Boris

What is computer vision? Automatic understanding of images and videos by a computer (which may be mounted on a robot or stand-alone).

1. Computing properties of the 3D world from visual data.

2. Algorithms and representations that allow a machine to recognize objects, people, scenes, and activities (perception and interpretation).

Some applications:

Real-time stereo (NASA Mars Rover)

Structure from motion (Pollefeys et al.)

Multi-view stereo for community photo collections (Goesele et al.)

Some applications:

Objects, activities, scenes, locations, text / writing, faces, gestures, motions, emotions…

[Figure: photo of the Cedar Point amusement park annotated with what can be recognized in it: sky, water, Lake Erie, Ferris wheel, amusement park, trees, carousel, deck, bench, umbrellas, pedestrians, people waiting in line, people sitting on ride, the text "12 E", and rides including maxair and The Wicked Twister.]

3D Reconstruction:

Given many images of a certain scene we can use computer vision algorithms to reconstruct the 3D model.

Connection to other disciplines :

Mathematics

Algorithms

Image processing

Artificial intelligence

Graphics

Machine learning

Computer vision

Robotics

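Image representation on Computer:

A grayscale image is a function $I(x, y) : [a, b] \times [c, d] \to [0, 255]$.

[Figure: a grayscale image of width 520 and height 500, with index i running along the height and index j along the width, both starting at 1. Example pixel values: I(176,201) = 164 and I(194,203) = 37.]

Intensity: [0,255]

Color images, RGB color space: three channels, R, G, B.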
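As a rough illustration of this representation, a grayscale image can be stored as a list of rows of integers in [0, 255], and a color image as rows of (R, G, B) triples. The sketch below is in Python. The tiny 3×4 image, the helper name `pixel`, and the choice of 1-based (i, j) indexing with i along the height are assumptions made for the example, not values taken from the slides.

```python
# Minimal sketch: images as nested lists (all values are made up for illustration).

# Grayscale: a height x width grid of intensities in [0, 255].
gray = [
    [0, 64, 128, 255],
    [10, 164, 37, 200],
    [90, 91, 92, 93],
]

# Color: the same grid, but every pixel is an (R, G, B) triple.
color = [[(v, 255 - v, v // 2) for v in row] for row in gray]


def pixel(image, i, j):
    """Return I(i, j) using 1-based indices: i along the height, j along the width."""
    return image[i - 1][j - 1]


height, width = len(gray), len(gray[0])
print(height, width)        # image size
print(pixel(gray, 2, 2))    # a single intensity
print(pixel(color, 2, 2))   # a single (R, G, B) triple
```

The only difference between the two representations is what sits in each cell: one number per pixel for intensity, three numbers per pixel for RGB.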

Image formation – Pinhole Camera:

• The pinhole camera is a simple model that approximates the imaging process as a perspective projection.

If we treat the pinhole as a point, only one ray from any given point can enter the camera.

[Figure: pinhole camera, showing the pinhole, the image plane, and the virtual image.]

• Far away objects appear smaller

Perspective Projection

Mathematical Equations

Perspective Projection & Calibration

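$$p_{3\times 1} = K_{3\times 3}\,\mathrm{Pers}_{3\times 4}\,\mathrm{Hom}_{4\times 4}\,p_{4\times 1}$$

Where:

• $p_{3\times 1}$: a point in pixel coordinates

• $K_{3\times 3}$: intrinsic parameters matrix, from image to pixel coordinates

• $\mathrm{Pers}_{3\times 4}$: perspective projection

• $\mathrm{Hom}_{4\times 4}$: homogeneous transformation from world frame to camera frame

• $p_{4\times 1}$: the same point in homogeneous world-frame coordinates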

Intrinsic parameters: from idealized world coordinates to pixel values

Perspective projection:

$$u = f\,\frac{x}{z}, \qquad v = f\,\frac{y}{z}$$
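The projection equations can be tried numerically. The following Python sketch applies $u = f x / z$, $v = f y / z$ to one point placed at two depths. The function name `project`, the focal length of 1.0, and the sample points are assumptions chosen for the example.

```python
def project(point, f):
    """Perspective projection of a camera-frame point (x, y, z) with focal length f."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the pinhole (z > 0)")
    return (f * x / z, f * y / z)


f = 1.0  # assumed focal length, arbitrary units
near = project((1.0, 2.0, 4.0), f)
far = project((1.0, 2.0, 8.0), f)
print(near)  # (0.25, 0.5)
print(far)   # (0.125, 0.25)
```

Doubling the depth z halves both image coordinates, which is the reason far away objects appear smaller.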

W. Freeman

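Intrinsic parameters

But “pixels” are in some arbitrary spatial units:

$$u = \alpha\,\frac{x}{z}, \qquad v = \alpha\,\frac{y}{z}$$

Maybe pixels are not square:

$$u = \alpha\,\frac{x}{z}, \qquad v = \beta\,\frac{y}{z}$$

We don’t know the origin of our camera pixel coordinates:

$$u = \alpha\,\frac{x}{z} + u_0, \qquad v = \beta\,\frac{y}{z} + v_0$$

May be skew between camera pixel axes, with angle $\theta$ between the $u$ and $v$ axes:

$$v' \sin(\theta) = v, \qquad u' = u - \cos(\theta)\,v' = u - \cot(\theta)\,v$$

$$u = \alpha\,\frac{x}{z} - \alpha \cot(\theta)\,\frac{y}{z} + u_0, \qquad v = \frac{\beta}{\sin(\theta)}\,\frac{y}{z} + v_0$$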

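Intrinsic parameters, homogeneous coordinates

With scale factors $\alpha$ and $\beta$, skew angle $\theta$, and pixel origin $(u_0, v_0)$:

$$u = \alpha\,\frac{x}{z} - \alpha \cot(\theta)\,\frac{y}{z} + u_0, \qquad v = \frac{\beta}{\sin(\theta)}\,\frac{y}{z} + v_0$$

Using homogeneous coordinates, we can write this as:

$$z \begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = \begin{pmatrix} \alpha & -\alpha \cot(\theta) & u_0 & 0 \\ 0 & \dfrac{\beta}{\sin(\theta)} & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}$$

or, with equality understood up to the scale factor $z$:

$$\vec{p} = K\,{}^{C}\vec{p}$$

where $\vec{p}$ is in pixels and ${}^{C}\vec{p}$ is in camera-based coords.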
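To make the homogeneous form concrete, here is a small Python sketch that builds the 3×4 intrinsic matrix and maps a camera-frame point to pixels, dividing by the third homogeneous component at the end. The function names `intrinsic_matrix` and `to_pixels` and all numeric values (scale factors, origin, sample point) are assumptions chosen for illustration.

```python
import math


def intrinsic_matrix(alpha, beta, theta, u0, v0):
    """3x4 intrinsic matrix: scale factors alpha, beta, skew angle theta, origin (u0, v0)."""
    return [
        [alpha, -alpha / math.tan(theta), u0, 0.0],
        [0.0, beta / math.sin(theta), v0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
    ]


def to_pixels(K, point_cam):
    """Multiply K by the homogeneous camera-frame point, then divide by the scale."""
    x, y, z = point_cam
    hom = [x, y, z, 1.0]
    su, sv, s = (sum(k * h for k, h in zip(row, hom)) for row in K)
    return (su / s, sv / s)


# Assumed values; theta = 90 degrees means the pixel axes are not skewed.
K = intrinsic_matrix(alpha=800.0, beta=820.0, theta=math.pi / 2, u0=320.0, v0=240.0)
u, v = to_pixels(K, (0.5, -0.25, 2.0))
print(round(u, 6), round(v, 6))
```

With no skew the result reduces to $u = \alpha x / z + u_0$ and $v = \beta y / z + v_0$, so this point lands at about (520, 137.5).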

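Extrinsic parameters: translation and rotation of camera frame

Non-homogeneous coordinates:

$${}^{C}\vec{p} = {}^{C}_{W}R\;{}^{W}\vec{p} + {}^{C}_{W}\vec{t}$$

Homogeneous coordinates:

$${}^{C}\vec{p} = \begin{pmatrix} {}^{C}_{W}R & {}^{C}_{W}\vec{t} \\ 0\;\;0\;\;0 & 1 \end{pmatrix} {}^{W}\vec{p}$$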
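The two forms of the extrinsic transformation can be compared directly. The Python sketch below applies $R\,p + t$ and then the equivalent 4×4 homogeneous matrix to the same world point. The helper names and the pose (a 90 degree rotation about the z axis followed by a shift of 5 along z) are assumptions for the example.

```python
import math


def world_to_camera(R, t, p_world):
    """Non-homogeneous form: p_C = R * p_W + t for a 3x3 rotation R and a 3-vector t."""
    return tuple(
        sum(R[r][c] * p_world[c] for c in range(3)) + t[r] for r in range(3)
    )


def homogeneous(R, t):
    """Stack R and t into the 4x4 matrix [[R, t], [0 0 0 1]]."""
    return [list(R[r]) + [t[r]] for r in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


# Assumed pose: rotate 90 degrees about z, then translate by 5 along z.
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 5.0]

p_w = (1.0, 0.0, 0.0)
p_c = world_to_camera(R, t, p_w)

H = homogeneous(R, t)
p_c_hom = [sum(H[r][k] * v for k, v in enumerate(p_w + (1.0,))) for r in range(4)]
print(p_c)
print(p_c_hom)
```

Both forms give the same camera-frame point; the homogeneous version only appends a trailing 1, which is what lets rotation and translation be expressed as a single matrix product.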

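Combining extrinsic and intrinsic calibration parameters, in homogeneous coordinates

Forsyth & Ponce

Intrinsic (camera coordinates to pixels):

$$\vec{p} = K\,{}^{C}\vec{p}$$

Extrinsic (world coordinates to camera coordinates):

$${}^{C}\vec{p} = \begin{pmatrix} {}^{C}_{W}R & {}^{C}_{W}\vec{t} \\ 0\;\;0\;\;0 & 1 \end{pmatrix} {}^{W}\vec{p}$$

Combined (world coordinates to pixels), with equality understood up to scale:

$$\vec{p} = K \begin{pmatrix} {}^{C}_{W}R & {}^{C}_{W}\vec{t} \end{pmatrix} {}^{W}\vec{p}$$

$$\vec{p} = M\,{}^{W}\vec{p}$$

In the combined form $K$ is the 3×3 intrinsic matrix and $\begin{pmatrix} R & \vec{t} \end{pmatrix}$ is 3×4, so $M$ is 3×4. The bottom row $(0\;0\;0\;1)$ of the extrinsic matrix drops out because it does not affect the pixel coordinates.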
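As an illustration of the combined mapping, the Python sketch below forms $M = K\,(R \mid t)$ and projects a world point to pixels. The helper names `matmul` and `project`, the intrinsic values, and the pose (identity rotation, camera shifted by 4 along z) are assumptions made up for the example.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def project(M, p_world):
    """Map a 3D world point to pixel coordinates using a 3x4 camera matrix M."""
    x, y, z = p_world
    su, sv, s = (row[0] for row in matmul(M, [[x], [y], [z], [1.0]]))
    return (su / s, sv / s)  # divide by the homogeneous scale


# Assumed intrinsics (square pixels, no skew) and extrinsics (R = identity, t = (0, 0, 4)).
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
Rt = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 4.0]]

M = matmul(K, Rt)  # 3x4 matrix taking world coordinates to pixels
print(project(M, (1.0, 0.5, 0.0)))  # (520.0, 340.0)
```

Precomputing M once is convenient when many world points have to be projected with the same camera.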

Image and Signal Processing by Computer

Edge Detection

Edge map of the image



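• We treat the image as a continuous function $f(x, y)$.

• The gradient of this function:

$$\nabla f = \left( \frac{\partial f}{\partial x},\; \frac{\partial f}{\partial y} \right)$$

• The direction of the gradient is the direction in which the gray levels change most rapidly. The magnitude of the gradient is the value of that maximal slope.

$$\|\nabla f\| = \sqrt{\left(\frac{\partial f}{\partial x}\right)^{2} + \left(\frac{\partial f}{\partial y}\right)^{2}}, \qquad \theta = \arctan\left( \frac{\partial f / \partial y}{\partial f / \partial x} \right)$$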
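A short numeric check of these definitions, written as a Python sketch: the partial derivatives are estimated with central differences, then combined into a magnitude and a direction. The intensity function (a linear ramp), the step size `h`, and the helper names are assumptions for the example. `math.atan2` is used in place of a plain arctangent so that a zero x-derivative does not cause a division by zero.

```python
import math


def gradient(f, x, y, h=1e-5):
    """Central-difference estimate of (df/dx, df/dy) at the point (x, y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy


def magnitude_and_direction(fx, fy):
    """Gradient magnitude and direction (radians); atan2 also handles fx == 0."""
    return math.hypot(fx, fy), math.atan2(fy, fx)


def ramp(x, y):
    """Assumed image: gray level rises by 3 per unit in x and by 4 per unit in y."""
    return 3.0 * x + 4.0 * y


fx, fy = gradient(ramp, 10.0, 20.0)
mag, ang = magnitude_and_direction(fx, fy)
print(round(mag, 4), round(math.degrees(ang), 2))
```

For this ramp the gradient is (3, 4) everywhere, so the magnitude is 5 and the direction is about 53 degrees from the x axis.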


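The gradient - example

[Figure: an image $f$ and its partial derivatives $\partial f / \partial x$ and $\partial f / \partial y$.]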


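The gradient - example (continued)

>> img = double(imread('cameraman.tif'));                        % grayscale image as doubles
>> gradFilt = [-1 0 1 ; -2 0 2 ; -1 0 1]/2;                      % Sobel-style x-derivative kernel
>> grad_x = imfilter(img , gradFilt , 'same' , 'replicate');     % derivative in the x direction
>> grad_y = imfilter(img , gradFilt' , 'same' , 'replicate');    % transposed kernel: y direction
>> [x,y] = meshgrid(1:size(img,2) , 1:size(img,1));              % pixel grid for the arrows
>> figure; imshow(img , []); hold on;
>> quiver(x , y , grad_x , grad_y , 3 , 'm' , 'LineWidth' , 1);  % draw the gradient field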


The gradient - another example (rice.png)


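Approximating the image gradient

• To compute the gradient, we need to compute the derivative in the x direction and in the y direction:

Prewitt, filter for computing the derivative in the x direction:

$$\begin{pmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{pmatrix}$$

Prewitt, filter for computing the derivative in the y direction:

$$\begin{pmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}$$

Sobel, filter for computing the derivative in the x direction:

$$\begin{pmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{pmatrix}$$

Sobel, filter for computing the derivative in the y direction:

$$\begin{pmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{pmatrix}$$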
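To show what these filters do, the Python sketch below applies Sobel kernels to a tiny image containing a vertical step edge. It uses correlation with replicated borders, which mirrors the default behavior of `imfilter` with the `'replicate'` option in the MATLAB example. The x kernel follows the sign convention of that MATLAB snippet (-1 0 1); a kernel written as 1 0 -1 gives the same response with the opposite sign. The function name `correlate` and the 4×4 test image are assumptions for the example.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # sign convention of the MATLAB snippet
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def correlate(image, kernel):
    """Correlate an image with a 3x3 kernel, replicating border pixels."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)   # clamp row index to the image
                    cc = min(max(c + dc, 0), w - 1)   # clamp column index to the image
                    acc += kernel[dr + 1][dc + 1] * image[rr][cc]
            out[r][c] = acc
    return out


# Assumed test image: a vertical step edge between columns 1 and 2.
image = [[0, 0, 100, 100] for _ in range(4)]
gx = correlate(image, SOBEL_X)
gy = correlate(image, SOBEL_Y)
print(gx[1])  # [0, 400, 400, 0]
print(gy[1])  # [0, 0, 0, 0]
```

The x-derivative responds only next to the step, and the y-derivative is zero everywhere because the image does not change from row to row.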



What is a good value for the threshold T?

[Figure: edge maps obtained by thresholding the gradient magnitude at T = 100, T = 70, T = 40, T = 20, T = 10, and T = 2.]
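The effect of the threshold can be seen on a toy example. This Python sketch marks a pixel as an edge when its gradient magnitude exceeds T and counts the edge pixels for three values of T. The magnitude values and the function name `edge_map` are assumptions invented for the illustration.

```python
def edge_map(magnitude, T):
    """Mark a pixel as an edge (1) when its gradient magnitude exceeds T."""
    return [[1 if m > T else 0 for m in row] for row in magnitude]


# Assumed gradient magnitudes: a strong edge (column 2), a weak edge (column 4), and noise.
magnitude = [
    [3, 8, 120, 5, 30, 4],
    [2, 9, 110, 6, 35, 3],
    [4, 7, 130, 4, 25, 5],
]

for T in (100, 20, 2):
    edges = edge_map(magnitude, T)
    print(T, sum(map(sum, edges)))  # threshold, number of edge pixels
```

A high threshold keeps only the strong edge, a medium one also keeps the weak edge, and a very low one marks almost every noisy pixel, which is the trade-off behind choosing T.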


Edge detection with Canny

• An alternative: Canny:

– E = edge(I, 'canny')

• Principles:

– Select only points that are a "local maximum" of the gradient magnitude.

– Select weak gradients only if they are connected to strong gradients.
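The two principles can be sketched in a few lines of Python. This is a simplified illustration, not the full Canny detector: smoothing is omitted, the gradient is assumed to point along x everywhere (so local maxima are searched along each row only), and the function names, thresholds, and magnitude values are all assumptions for the example.

```python
from collections import deque


def suppress_non_maxima_rows(mag):
    """Keep a pixel only if it is a local maximum along its row (gradient assumed along x)."""
    out = []
    for row in mag:
        kept = []
        for c, m in enumerate(row):
            left = row[c - 1] if c > 0 else 0
            right = row[c + 1] if c + 1 < len(row) else 0
            kept.append(m if m >= left and m > right else 0)
        out.append(kept)
    return out


def hysteresis(mag, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) that connect to a strong one."""
    h, w = len(mag), len(mag[0])
    edges = [[0] * w for _ in range(h)]
    queue = deque((r, c) for r in range(h) for c in range(w) if mag[r][c] >= high)
    for r, c in queue:
        edges[r][c] = 1
    while queue:  # grow outward from strong pixels through 8-connected neighbours
        r, c = queue.popleft()
        for rr in range(max(r - 1, 0), min(r + 2, h)):
            for cc in range(max(c - 1, 0), min(c + 2, w)):
                if not edges[rr][cc] and mag[rr][cc] >= low:
                    edges[rr][cc] = 1
                    queue.append((rr, cc))
    return edges


# Assumed gradient magnitudes: one edge that fades downward, plus an isolated weak response.
mag = [
    [0, 20, 90, 30, 0, 0, 40, 0],
    [0, 10, 40, 15, 0, 0, 0, 0],
    [0, 5, 35, 10, 0, 0, 0, 0],
]
thin = suppress_non_maxima_rows(mag)
edges = hysteresis(thin, low=30, high=80)
for row in edges:
    print(row)
```

The weak responses of 40 and 35 in column 2 survive because they connect to the strong pixel above them, while the isolated weak response of 40 in column 6 is discarded.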
