Scale Invariant Feature Transform (SIFT)

Upload: vincent-petty

Post on 30-Dec-2015


DESCRIPTION

Scale Invariant Feature Transform (SIFT). Outline: What is SIFT, Algorithm overview, Object Detection, Summary. Overview: 1999. Generates image features, “keypoints”, that are invariant to image scaling and rotation and partially invariant to change in illumination and 3D camera viewpoint. PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Scale Invariant Feature Transform (SIFT)

Scale Invariant Feature Transform (SIFT)

Page 2: Scale Invariant Feature Transform (SIFT)

Outline

What is SIFT

Algorithm overview

Object Detection

Summary

Page 3: Scale Invariant Feature Transform (SIFT)

Overview

Introduced in 1999

Generates image features, “keypoints”
– invariant to image scaling and rotation
– partially invariant to change in illumination and 3D camera viewpoint
– many can be extracted from typical images
– highly distinctive

Page 4: Scale Invariant Feature Transform (SIFT)

Algorithm overview

Scale-space extrema detection
– Uses difference-of-Gaussian function

Keypoint localization
– Sub-pixel location and scale fit to a model

Orientation assignment
– 1 or more for each keypoint

Keypoint descriptor
– Created from local image gradients

Page 5: Scale Invariant Feature Transform (SIFT)

Scale space

Definition:

$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$$

where

$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2)/(2\sigma^2)}$$
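As a rough illustration of the scale-space definition, the sketch below blurs an image with a sampled Gaussian using NumPy only. The function names (`gaussian_kernel_1d`, `gaussian_blur`), the truncation of the kernel at 3σ, and the reflective border handling are assumptions made for this example, not something the slides specify.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Sampled 1D Gaussian, normalized to sum to 1.

    The 2D Gaussian G(x, y, sigma) is separable, so two 1D passes suffice.
    Assumption: the kernel is truncated at a radius of 3 * sigma.
    """
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    g = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

def gaussian_blur(image, sigma):
    """Approximate L(x, y, sigma) = G(x, y, sigma) * I(x, y)."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    # Assumption: reflect the image at its borders before convolving.
    padded = np.pad(np.asarray(image, dtype=float), r, mode="reflect")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)
```

Calling `gaussian_blur(image, 1.6)` on a 2D array returns an array of the same shape; the kernel is symmetric, so convolution and correlation coincide here.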

Page 6: Scale Invariant Feature Transform (SIFT)

Scale space

Keypoints are detected using scale-space extrema in difference-of-Gaussian function D

D definition:

$$D(x, y, \sigma) = (G(x, y, k\sigma) - G(x, y, \sigma)) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)$$

Efficient to compute
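The reason D is cheap to compute is that it only needs a subtraction of blurred images that are built anyway. A minimal sketch, assuming the blurred images are already stacked in order of increasing scale (the name `dog_stack` and the stacking convention are illustrative):

```python
import numpy as np

def dog_stack(gaussian_stack):
    """Difference-of-Gaussian images from adjacent blurred images.

    gaussian_stack[i] is assumed to hold L(x, y, k**i * sigma), so
    result[i] = L(x, y, k**(i + 1) * sigma) - L(x, y, k**i * sigma).
    """
    stack = np.asarray(gaussian_stack, dtype=float)
    return stack[1:] - stack[:-1]
```

A stack of n blurred images yields n - 1 difference images of the same height and width.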

Page 7: Scale Invariant Feature Transform (SIFT)

Relationship of D to σ²∇²G

Close approximation to scale-normalized Laplacian of Gaussian, $\sigma^2 \nabla^2 G$

Diffusion equation:

$$\frac{\partial G}{\partial \sigma} = \sigma \nabla^2 G$$

Approximate ∂G/∂σ:

$$\frac{\partial G}{\partial \sigma} \approx \frac{G(x, y, k\sigma) - G(x, y, \sigma)}{k\sigma - \sigma}$$

– giving,

$$G(x, y, k\sigma) - G(x, y, \sigma) \approx (k - 1)\sigma^2 \nabla^2 G$$

When D has scales differing by a constant factor it already incorporates the σ² scale normalization required for scale-invariance
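The approximation can be checked numerically. The sketch below evaluates both sides at a point, using the closed form ∇²G = G · ((x² + y²)/σ⁴ − 2/σ²) that follows from differentiating the Gaussian twice. The function names and the sample values σ = 1.6, k = 1.01 are choices made for this illustration.

```python
import math

def gaussian(x, y, sigma):
    """G(x, y, sigma) with the 1 / (2 pi sigma^2) normalization."""
    return math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)

def scale_normalized_log(x, y, sigma):
    """sigma^2 times the Laplacian of G, from the closed-form second derivatives."""
    r2 = x * x + y * y
    laplacian = gaussian(x, y, sigma) * (r2 / sigma ** 4 - 2 / sigma ** 2)
    return sigma ** 2 * laplacian

def dog(x, y, sigma, k):
    """G(x, y, k sigma) - G(x, y, sigma)."""
    return gaussian(x, y, k * sigma) - gaussian(x, y, sigma)

sigma, k = 1.6, 1.01
lhs = dog(0.0, 0.0, sigma, k)
rhs = (k - 1) * scale_normalized_log(0.0, 0.0, sigma)
relative_error = abs(lhs - rhs) / abs(rhs)  # small when k is close to 1
```

The agreement degrades as k moves away from 1, which is expected of a first-order finite-difference approximation.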

Page 8: Scale Invariant Feature Transform (SIFT)

Scale space construction

[Figure: scale-space construction diagram; levels labelled σ, 2kσ, 2k²σ]
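One way to read the construction is as a schedule of blur levels per octave. The sketch below lists those levels under assumptions that the slides do not state explicitly: each octave is split into `intervals` steps with k = 2^(1/intervals), each octave holds `intervals + 3` blurred images, and each new octave starts at twice the base scale of the previous one (in practice by downsampling the image by 2). The function name and defaults are illustrative.

```python
def octave_sigmas(sigma0=1.6, intervals=3, num_octaves=4):
    """Blur levels for each octave, expressed relative to the original image.

    Assumptions for this sketch: k = 2 ** (1 / intervals), and
    intervals + 3 Gaussian images per octave so that the adjacent
    differences cover a full doubling of scale with one image to spare
    on each side.
    """
    k = 2.0 ** (1.0 / intervals)
    return [
        [sigma0 * (2 ** octave) * (k ** i) for i in range(intervals + 3)]
        for octave in range(num_octaves)
    ]
```

With these defaults the level `intervals` steps into an octave has twice the base scale, which is where the next octave begins.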

Page 9: Scale Invariant Feature Transform (SIFT)

Scale space images

[Figure: scale space images for the first, second, third and fourth octaves]

Page 10: Scale Invariant Feature Transform (SIFT)

Difference-of-Gaussian images

[Figure: difference-of-Gaussian images for the first, second, third and fourth octaves]

Page 11: Scale Invariant Feature Transform (SIFT)

Frequency of sampling

There is no minimum sampling frequency that guarantees all extrema are detected

Best frequency determined experimentally

Page 12: Scale Invariant Feature Transform (SIFT)

Prior smoothing for each octave

Increasing σ increases robustness, but costs computation time

σ = 1.6 a good tradeoff

Doubling the image size initially increases the number of keypoints

Page 13: Scale Invariant Feature Transform (SIFT)

Finding extrema

Sample point is selected only if it is a minimum or a maximum of its neighbours in the DoG scale space (8 in the same image and 9 in each adjacent scale)

[Figures: DoG scale space; extrema in this image]
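A minimal sketch of the extremum test, assuming the DoG images of one octave are stacked into an array indexed as `dog[scale, y, x]` and that a strict comparison against the 26 surrounding samples is wanted. The function names are illustrative, and the brute-force loop favours clarity over speed.

```python
import numpy as np

def is_extremum(dog, s, y, x):
    """True if dog[s, y, x] is strictly above or strictly below all 26 neighbours."""
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    # Index 13 is the centre of the flattened 3x3x3 cube; drop it.
    neighbours = np.delete(cube.ravel(), 13)
    centre = dog[s, y, x]
    return bool(centre > neighbours.max() or centre < neighbours.min())

def find_extrema(dog):
    """Scan all interior samples of a (scales, height, width) DoG stack."""
    scales, height, width = dog.shape
    return [
        (s, y, x)
        for s in range(1, scales - 1)
        for y in range(1, height - 1)
        for x in range(1, width - 1)
        if is_extremum(dog, s, y, x)
    ]
```

Samples on the border of the stack, including the first and last scale, are skipped because they lack a full neighbourhood.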

Page 14: Scale Invariant Feature Transform (SIFT)

Localization

3D quadratic function is fit to the local sample points

Start with Taylor expansion with sample point as the origin

$$D(X) = D + \frac{\partial D}{\partial X}^T X + \frac{1}{2} X^T \frac{\partial^2 D}{\partial X^2} X$$

– where $X = (x, y, \sigma)^T$

Take the derivative with respect to X, and set it to 0, giving

$$0 = \frac{\partial D}{\partial X} + \frac{\partial^2 D}{\partial X^2} \hat{X}$$

$$\hat{X} = -\left(\frac{\partial^2 D}{\partial X^2}\right)^{-1} \frac{\partial D}{\partial X}$$

$\hat{X}$ is the location of the keypoint

This is a 3x3 linear system
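Solving the 3x3 system is a one-liner once the gradient and Hessian of D at the sample point are available. The sketch below assumes both are given in (x, y, σ) order; `refine_offset` is a name chosen for this example.

```python
import numpy as np

def refine_offset(gradient, hessian):
    """Solve (d2D/dX2) X_hat = -(dD/dX) for the offset X_hat = (x, y, sigma).

    gradient: length-3 sequence (dD/dx, dD/dy, dD/dsigma)
    hessian:  3x3 matrix of second derivatives in the same order
    """
    g = np.asarray(gradient, dtype=float)
    H = np.asarray(hessian, dtype=float)
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(H, -g)
```

`np.linalg.solve` raises `LinAlgError` for a singular Hessian, which a caller would typically treat as a rejected candidate.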

Page 15: Scale Invariant Feature Transform (SIFT)

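Localization

The 3x3 linear system written out:

$$\begin{bmatrix} \frac{\partial^2 D}{\partial x^2} & \frac{\partial^2 D}{\partial x \partial y} & \frac{\partial^2 D}{\partial x \partial \sigma} \\ \frac{\partial^2 D}{\partial x \partial y} & \frac{\partial^2 D}{\partial y^2} & \frac{\partial^2 D}{\partial y \partial \sigma} \\ \frac{\partial^2 D}{\partial x \partial \sigma} & \frac{\partial^2 D}{\partial y \partial \sigma} & \frac{\partial^2 D}{\partial \sigma^2} \end{bmatrix} \begin{bmatrix} x \\ y \\ \sigma \end{bmatrix} = -\begin{bmatrix} \frac{\partial D}{\partial x} \\ \frac{\partial D}{\partial y} \\ \frac{\partial D}{\partial \sigma} \end{bmatrix}$$

Derivatives approximated by finite differences,
– example (k indexes scale, i and j index x and y):

$$\frac{\partial D}{\partial \sigma} \approx \frac{D_{k+1}^{i,j} - D_{k-1}^{i,j}}{2}$$

$$\frac{\partial^2 D}{\partial \sigma^2} \approx D_{k-1}^{i,j} - 2 D_{k}^{i,j} + D_{k+1}^{i,j}$$

$$\frac{\partial^2 D}{\partial \sigma \partial y} \approx \frac{(D_{k+1}^{i,j+1} - D_{k-1}^{i,j+1}) - (D_{k+1}^{i,j-1} - D_{k-1}^{i,j-1})}{4}$$

If $\hat{X}$ is > 0.5 in any dimension, the extremum lies closer to a different sample point, so the process is repeated at that point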
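The finite differences can be collected into one helper that reads a 3x3x3 neighbourhood of DoG samples. This is a sketch under an assumed indexing convention, `cube[scale, y, x]` with the sample point at `cube[1, 1, 1]` and unit spacing in every dimension; the function name is illustrative.

```python
import numpy as np

def gradient_and_hessian(cube):
    """Central-difference gradient and Hessian of D, in (x, y, sigma) order."""
    c = np.asarray(cube, dtype=float)
    centre = c[1, 1, 1]

    # First derivatives: (next - previous) / 2 along each axis.
    dx = (c[1, 1, 2] - c[1, 1, 0]) / 2.0
    dy = (c[1, 2, 1] - c[1, 0, 1]) / 2.0
    ds = (c[2, 1, 1] - c[0, 1, 1]) / 2.0

    # Second derivatives: previous - 2 * centre + next.
    dxx = c[1, 1, 2] - 2.0 * centre + c[1, 1, 0]
    dyy = c[1, 2, 1] - 2.0 * centre + c[1, 0, 1]
    dss = c[2, 1, 1] - 2.0 * centre + c[0, 1, 1]

    # Mixed derivatives: difference of differences over the four corners, / 4.
    dxy = (c[1, 2, 2] - c[1, 2, 0] - c[1, 0, 2] + c[1, 0, 0]) / 4.0
    dxs = (c[2, 1, 2] - c[2, 1, 0] - c[0, 1, 2] + c[0, 1, 0]) / 4.0
    dys = (c[2, 2, 1] - c[2, 0, 1] - c[0, 2, 1] + c[0, 0, 1]) / 4.0

    gradient = np.array([dx, dy, ds])
    hessian = np.array([[dxx, dxy, dxs],
                        [dxy, dyy, dys],
                        [dxs, dys, dss]])
    return gradient, hessian
```

Central differences are exact for a quadratic, so the helper can be sanity-checked on a synthetic quadratic whose derivatives are known.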

Page 16: Scale Invariant Feature Transform (SIFT)

Filtering

Contrast (use prev. equation):

$$D(\hat{X}) = D + \frac{1}{2} \frac{\partial D}{\partial X}^T \hat{X}$$

– If $|D(\hat{X})| < 0.03$, throw it out

Edge-iness:
– Use ratio of principal curvatures to throw out poorly defined peaks
– Curvatures come from Hessian:

$$H = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix}$$

– Ratio of Trace(H)² and Determinant(H)

$$\mathrm{Tr}(H) = D_{xx} + D_{yy}$$

$$\mathrm{Det}(H) = D_{xx} D_{yy} - (D_{xy})^2$$

– If ratio > (r+1)²/r, throw it out (SIFT uses r=10)
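Both filters reduce to a few arithmetic operations per candidate. The sketch below uses the thresholds from the slide (0.03 and r = 10); the function names are illustrative, the 0.03 threshold is assumed to refer to pixel values scaled to [0, 1], and rejecting candidates with a non-positive determinant is an extra assumption of this sketch.

```python
def passes_contrast(d_value, gradient, offset, threshold=0.03):
    """Keep the keypoint unless |D(X_hat)| < threshold.

    D(X_hat) = D + 0.5 * (dD/dX)^T X_hat, with gradient and offset
    given in the same (x, y, sigma) order.
    """
    d_hat = d_value + 0.5 * sum(g * o for g, o in zip(gradient, offset))
    return abs(d_hat) >= threshold

def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep the keypoint unless Tr(H)^2 / Det(H) > (r + 1)^2 / r."""
    trace = dxx + dyy
    det = dxx * dyy - dxy ** 2
    if det <= 0:
        # Assumption: curvatures of opposite sign are treated as a poorly defined peak.
        return False
    return trace ** 2 / det <= (r + 1) ** 2 / r
```

For r = 10 the ratio threshold is 12.1; a blob with equal curvatures gives a ratio of 4 and passes, while an elongated edge response gives a much larger ratio and is discarded.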

Page 17: Scale Invariant Feature Transform (SIFT)

Orientation assignment

Descriptor computed relative to keypoint’s orientation achieves rotation invariance

Orientation is precomputed along with gradient magnitude for all levels (useful in descriptor computation)

$$m(x, y) = \sqrt{(L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2}$$

$$\theta(x, y) = \operatorname{atan2}\big(L(x, y+1) - L(x, y-1),\; L(x+1, y) - L(x-1, y)\big)$$

Multiple orientations assigned to keypoints from an orientation histogram
– Significantly improve stability of matching
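The gradient magnitude and orientation, and a histogram built from them, might be sketched as follows. The 36-bin histogram, the 80% peak rule, and all function names are assumptions of this example; the slides only say that multiple orientations come from an orientation histogram. The Gaussian weighting of samples around the keypoint is omitted for brevity.

```python
import numpy as np

def gradient_mag_ori(L):
    """Pixel-difference magnitude m and orientation theta for interior pixels.

    L is indexed as L[y, x]; the result has shape (height - 2, width - 2).
    Angles are in radians, measured in image coordinates (y increases downward).
    """
    L = np.asarray(L, dtype=float)
    dx = L[1:-1, 2:] - L[1:-1, :-2]   # L(x+1, y) - L(x-1, y)
    dy = L[2:, 1:-1] - L[:-2, 1:-1]   # L(x, y+1) - L(x, y-1)
    return np.hypot(dx, dy), np.arctan2(dy, dx)

def orientation_histogram(m, theta, num_bins=36):
    """Magnitude-weighted histogram of orientations over [0, 2 pi)."""
    frac = (theta % (2 * np.pi)) / (2 * np.pi)
    bins = np.floor(frac * num_bins).astype(int) % num_bins
    return np.bincount(bins.ravel(), weights=m.ravel(), minlength=num_bins)

def dominant_bins(hist, peak_ratio=0.8):
    """Local peaks (circular) that reach peak_ratio of the highest bin."""
    left, right = np.roll(hist, 1), np.roll(hist, -1)
    peaks = (hist > left) & (hist > right) & (hist >= peak_ratio * hist.max())
    return [int(i) for i in np.flatnonzero(peaks)]
```

Each returned bin would produce its own keypoint with that orientation, which is how one location can carry more than one orientation.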

Page 18: Scale Invariant Feature Transform (SIFT)

Keypoint images

Page 19: Scale Invariant Feature Transform (SIFT)

Descriptor

Descriptor has 3 dimensions (x, y, θ)

Orientation histogram of gradient magnitudes

Position and orientation of each gradient sample rotated relative to keypoint orientation

Page 20: Scale Invariant Feature Transform (SIFT)

Descriptor

Weight magnitude of each sample point by Gaussian weighting function

Distribute each sample to adjacent bins by trilinear interpolation (avoids boundary effects)
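To show what distributing a sample to adjacent bins means, the sketch below handles only the orientation axis: a sample is split between the two nearest orientation bins in proportion to its distance from each bin centre. Applying the same kind of weight along x and y as well gives the trilinear scheme. The function name, the 8-bin default, and the choice of bin centres at multiples of the bin width are assumptions of this example.

```python
import math

def soft_assign_orientation(theta, magnitude, num_bins=8):
    """Split one gradient sample between its two nearest orientation bins.

    Assumption: bin b is centred at angle b * (2 pi / num_bins).
    """
    hist = [0.0] * num_bins
    position = (theta % (2 * math.pi)) / (2 * math.pi) * num_bins
    lower = math.floor(position)
    frac = position - lower            # distance past the lower bin centre
    hist[lower % num_bins] += magnitude * (1.0 - frac)
    hist[(lower + 1) % num_bins] += magnitude * frac
    return hist
```

A sample exactly on a bin centre contributes only to that bin, and a small change in angle moves weight gradually rather than flipping the whole sample into the next bin, which is the boundary effect the interpolation avoids.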

Page 21: Scale Invariant Feature Transform (SIFT)

Descriptor

Best results achieved with 4x4x8 = 128 descriptor size

Normalize to unit length
– Reduces effect of illumination change

Cap each element to 0.2, normalize again
– Reduces non-linear illumination changes
– 0.2 determined experimentally
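The normalization steps translate directly into a few array operations. A minimal sketch, where the function name and the small `eps` guard against an all-zero vector are additions made for this example:

```python
import numpy as np

def normalize_descriptor(vec, cap=0.2, eps=1e-12):
    """Normalize to unit length, cap each element at `cap`, normalize again."""
    v = np.asarray(vec, dtype=float)
    v = v / max(np.linalg.norm(v), eps)   # removes overall contrast scaling
    v = np.minimum(v, cap)                # limits the influence of large gradients
    return v / max(np.linalg.norm(v), eps)
```

Because of the first normalization, multiplying every gradient magnitude by a constant leaves the result unchanged.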

Page 22: Scale Invariant Feature Transform (SIFT)

Object Detection

Create a database of keypoints from training images

Match keypoints to a database
– Nearest neighbor search
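A brute-force version of the nearest neighbor search can be written in a few lines. The sketch assumes Euclidean distance between descriptors and small enough arrays that all pairwise distances fit in memory; `match_nearest` is an illustrative name, and a practical system would likely use an approximate search structure instead.

```python
import numpy as np

def match_nearest(query, database):
    """For each query descriptor, return the index of and distance to its
    nearest database descriptor (Euclidean distance, exhaustive search)."""
    q = np.asarray(query, dtype=float)
    db = np.asarray(database, dtype=float)
    # Pairwise squared distances, shape (num_query, num_database).
    d2 = ((q[:, None, :] - db[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    return idx, np.sqrt(d2[np.arange(len(q)), idx])
```

The returned distances can be thresholded by the caller to discard weak matches.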

Page 23: Scale Invariant Feature Transform (SIFT)

PCA-SIFT

Different descriptor (same keypoints)

Apply PCA to the gradient patch

Descriptor size is 20 (instead of 128)

More robust, faster
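The PCA step itself can be sketched with an SVD: learn a mean and a set of principal directions from many flattened gradient patches, then project each new patch onto the first 20 directions. How the gradient patch is extracted and flattened is not covered by the slide and is left out here; the function names are illustrative.

```python
import numpy as np

def fit_pca(patches, n_components=20):
    """Learn the mean and the top principal directions from training patches.

    patches: array of shape (num_patches, patch_dim), one flattened
    gradient patch per row.
    """
    X = np.asarray(patches, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(patch, mean, components):
    """Compact descriptor: coordinates of the patch in the PCA basis."""
    return components @ (np.asarray(patch, dtype=float) - mean)
```

The projection is a single matrix-vector product per keypoint, and the shorter vectors also make the later nearest neighbor search cheaper.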

Page 24: Scale Invariant Feature Transform (SIFT)

Summary

Scale space

Difference-of-Gaussian

Localization

Filtering

Orientation assignment

Descriptor, 128 elements