Image Mosaicing
Shiran Stan-Meleh
*http://www.ptgui.com/info/image_stitching.html




Why do we need it?

- Panorama
- 360° view
- Satellite images

Compact camera FOV = 50 x 35°
Human FOV = 200 x 135°
Panoramic mosaic = 360 x 180°

How do we do it? 2 methods:
- Direct (appearance-based): search for the alignment where most pixels agree
- Feature-based: find a few matching features in both images, then compute the transformation
*Copied from Hagit Hel-Or ppt

How do we do it? Manually

*http://www.marymount.fr/uploads/galleries/gallery402/images/002_matisse_project_gluing.jpg

How do we do it? Direct (appearance-based) methods
- Define an error metric to compare the images, e.g. the sum of squared differences (SSD)
- Define a search technique (simplest: full search)
Pros:
- Simple algorithm; can work with complicated transformations
- Good for matching sequential frames in a video
Cons:
- Need to manually estimate parameters
- Can be very slow

How do we do it? Feature-based methods
- Harris corner detection - C. Harris & M. Stephens (1988)
- SIFT - David Lowe (1999)
- PCA-SIFT - Y. Ke & R. Sukthankar (2004)
- SURF - Bay & Tuytelaars (2006)
- GLOH - Mikolajczyk & Schmid (2005)
- HOG - Dalal & Triggs (2005)

GLOH (Gradient Location and Orientation Histogram) is a robust image descriptor used in computer vision tasks. It is a SIFT-like descriptor that considers more spatial regions for its histograms; the descriptor's higher dimensionality is reduced to 64 through principal component analysis (PCA).
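To make the direct approach concrete, here is a minimal sketch of SSD plus a full search over integer translations. It assumes grayscale NumPy images and a translation-only model; `align_translation` and the toy images are made up for this illustration.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two equally sized patches."""
    d = a.astype(float) - b.astype(float)
    return float(np.sum(d * d))

def align_translation(ref, moving, max_shift=5):
    """Full search: try every integer (dy, dx) shift and keep the one
    that minimizes SSD on a central region unaffected by wrap-around."""
    best, best_shift = np.inf, (0, 0)
    h, w = ref.shape
    m = max_shift
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(moving, dy, axis=0), dx, axis=1)
            err = ssd(ref[m:h - m, m:w - m], shifted[m:h - m, m:w - m])
            if err < best:
                best, best_shift = err, (dy, dx)
    return best_shift

# toy example: a bright square, shifted by (2, 3) between the two frames
ref = np.zeros((32, 32)); ref[10:20, 10:20] = 1.0
mov = np.roll(np.roll(ref, -2, axis=0), -3, axis=1)
print(align_translation(ref, mov))  # (2, 3)
```

The quadratic cost of this full search (every shift times every pixel) is exactly why the slides call direct methods "very slow" compared with feature matching.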

Histogram of Oriented Gradients (HOG). The technique counts occurrences of gradient orientations in localized portions of an image. It differs from similar descriptors in that it is computed on a dense grid of uniformly spaced cells and uses overlapping local contrast normalization for improved accuracy.
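The counting step can be sketched as follows. This is a simplified HOG without the overlapping block normalization; the cell size, bin count, and helper name are illustrative, not Dalal & Triggs' exact parameters.

```python
import numpy as np

def hog_cell_histograms(img, cell=8, bins=9):
    """Orientation histograms over a dense grid of cells:
    unsigned gradient directions (0-180 degrees), magnitude-weighted."""
    img = img.astype(float)
    gy, gx = np.gradient(img)                      # gradients per axis
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    h, w = img.shape
    hists = np.zeros((h // cell, w // cell, bins))
    for i in range(h // cell):
        for j in range(w // cell):
            m = mag[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell].ravel()
            a = ang[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell].ravel()
            idx = np.minimum((a / (180.0 / bins)).astype(int), bins - 1)
            for k in range(bins):
                hists[i, j, k] = m[idx == k].sum()  # magnitude-weighted vote
    return hists

# a vertical step edge produces horizontal gradients -> 0-degree bin
img = np.zeros((16, 16)); img[:, 8:] = 1.0
h = hog_cell_histograms(img)
print(h.shape)  # (2, 2, 9)
```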


Agenda
We will concentrate on feature-based methods, using SIFT for feature extraction and RANSAC for feature matching and transformation estimation:
- Some background
- SIFT and RANSAC

What is SIFT?
From Wiki: an algorithm in computer vision to detect and describe local features in images. The algorithm was published by David Lowe in 1999.

Scale-Invariant Feature Transform

Applications
- Object recognition
- Robotic mapping and navigation
- Image stitching
- 3D modeling
- Gesture recognition
- Video tracking
- Individual identification of wildlife
- Match moving

Basic Steps
1. Scale space extrema detection
   a. Construct scale space
   b. Take difference of Gaussians
   c. Locate DoG extrema
2. Keypoint localization
3. Orientation assignment
4. Build keypoint descriptors

*http://www.csie.ntu.edu.tw/~cyy/courses/vfx/05spring/lectures/handouts/lec04_feature.pdf

First Octave / Second Octave
*copied from Hagit Hel-Or ppt

1a. Construct Scale Space
Explanation: represent an image at different scales, at different blur levels.
Motivation: real-world objects are composed of different structures at different scales. For example, a tree can be examined at leaf level, where we can see its texture, or from a large distance, where it can be seen as a dot. Studies also show a close link between scale space and biological vision.
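The octave construction can be sketched as below, assuming Lowe's common defaults (base blur sigma0 = 1.6, s = 3 intervals per octave, s + 3 images per octave); the helper names are illustrative. Adjacent levels are also differenced here, which is the DoG of step 1b.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with a truncated, normalized 1-D kernel."""
    r = max(1, int(3 * sigma))
    r = min(r, (min(img.shape) - 1) // 2)   # keep kernel smaller than image
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    out = np.apply_along_axis(lambda v: np.convolve(v, k, mode='same'), 0, img)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode='same'), 1, out)

def build_scale_space(img, n_octaves=3, s=3, sigma0=1.6):
    """Each octave holds s+3 levels whose blur grows by k = 2**(1/s);
    the next octave starts from a 2x-downsampled image."""
    k = 2 ** (1.0 / s)
    octaves, base = [], img.astype(float)
    for _ in range(n_octaves):
        levels = [gaussian_blur(base, sigma0 * k**i) for i in range(s + 3)]
        octaves.append(levels)
        base = levels[s][::2, ::2]   # the level with twice the base blur
    return octaves

def difference_of_gaussians(levels):
    """Adjacent-level differences approximate the Laplacian of Gaussian."""
    return [b - a for a, b in zip(levels, levels[1:])]

ss = build_scale_space(np.random.default_rng(0).random((64, 64)))
print(len(ss), len(ss[0]), ss[1][0].shape)  # 3 6 (32, 32)
```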

1b. Take Difference of Gaussians
*Mikolajczyk 2002

*Distinctive Image Features from Scale-Invariant Keypoints, David G. Lowe

1c. Locate DoG Extrema
Find all extrema, i.e. every point that is a minimum or maximum over its 3x3x3 neighborhood:
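A brute-force version of this 3x3x3 search might look like the sketch below, assuming the DoG images of one octave are stacked into a single (scales, height, width) array; only strict extrema are kept.

```python
import numpy as np

def dog_extrema(dog):
    """Return (scale, row, col) points that are a strict minimum or
    maximum over their 3x3x3 neighborhood in a DoG stack (S, H, W)."""
    pts = []
    S, H, W = dog.shape
    for s in range(1, S - 1):
        for y in range(1, H - 1):
            for x in range(1, W - 1):
                cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                v = dog[s, y, x]
                # strict: the center must be the unique min or max of the cube
                if v == cube.max() and (cube == v).sum() == 1:
                    pts.append((s, y, x))
                elif v == cube.min() and (cube == v).sum() == 1:
                    pts.append((s, y, x))
    return pts

# a single bright blob in the middle scale is the only extremum
dog = np.zeros((3, 5, 5)); dog[1, 2, 2] = 1.0
print(dog_extrema(dog))  # [(1, 2, 2)]
```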

*Distinctive Image Features from Scale-Invariant Keypoints, David G. Lowe

Basic Steps
1. Scale space extrema detection
2. Keypoint localization
   a. Sub-pixel locate potential feature points
   b. Filter edge and low contrast responses
3. Orientation assignment
4. Build keypoint descriptors

*http://www.csie.ntu.edu.tw/~cyy/courses/vfx/05spring/lectures/handouts/lec04_feature.pdf

2a. Sub-Pixel Locate Potential Feature Points

*http://www.inf.fu-berlin.de/lehre/SS09/CV/uebungen/uebung09/SIFT.pdf

2b. Filter Edge and Low Contrast Responses

*http://www.inf.fu-berlin.de/lehre/SS09/CV/uebungen/uebung09/SIFT.pdf

2b. Filter Edge and Low Contrast Responses
Both eigenvalues of the Hessian matrix must be large.
The Hessian matrix (or simply the Hessian) is the square matrix of second-order partial derivatives of a function; it describes the local curvature of a function of many variables.
An eigenvector v of a square matrix A is a non-zero vector that, when multiplied by A, yields the original vector scaled by a single number λ (its eigenvalue); that is: A v = λ v.
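Lowe's paper applies this eigenvalue criterion via the trace and determinant of the 2x2 Hessian, without computing the eigenvalues explicitly. A sketch, using finite differences on a single DoG image and the paper's ratio r = 10 (the helper name is illustrative):

```python
import numpy as np

def passes_edge_test(dog, y, x, r=10.0):
    """Reject keypoints on edges, where one eigenvalue of the Hessian is
    much larger than the other. Accept iff tr(H)^2 / det(H) < (r+1)^2 / r."""
    d = dog
    dxx = d[y, x + 1] - 2 * d[y, x] + d[y, x - 1]
    dyy = d[y + 1, x] - 2 * d[y, x] + d[y - 1, x]
    dxy = (d[y + 1, x + 1] - d[y + 1, x - 1]
           - d[y - 1, x + 1] + d[y - 1, x - 1]) / 4.0
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:          # curvatures of opposite sign: not a stable extremum
        return False
    return bool(tr * tr / det < (r + 1) ** 2 / r)

# isolated peak: both principal curvatures large and equal -> accepted
blob = np.zeros((5, 5)); blob[2, 2] = 1.0
# straight ridge: curvature in only one direction -> rejected as an edge
ridge = np.zeros((5, 5)); ridge[2, :] = 1.0
print(passes_edge_test(blob, 2, 2))   # True
print(passes_edge_test(ridge, 2, 2))  # False
```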

A Picture Worth a 1000 Keypoints
- Original image
- Initial features (832)
- Low contrast removed (729)
- Low curvature removed (536)

*Distinctive Image Features from Scale-Invariant Keypoints David G. Lowe

Basic Steps
1. Scale space extrema detection
2. Keypoint localization
3. Orientation assignment
4. Build keypoint descriptors

Low contrast removed / Low curvature removed
*http://www.csie.ntu.edu.tw/~cyy/courses/vfx/05spring/lectures/handouts/lec04_feature.pdf

3. Orientation Assignment
*http://www.inf.fu-berlin.de/lehre/SS09/CV/uebungen/uebung09/SIFT.pdf

Basic Steps
1. Scale space extrema detection
2. Keypoint localization
3. Orientation assignment
4. Build keypoint descriptors

*http://www.csie.ntu.edu.tw/~cyy/courses/vfx/05spring/lectures/handouts/lec04_feature.pdf

4. Build Keypoint Descriptors
*Image from: Jonas Hurrelmann

Live Demo

And next: RANSAC

*http://habrahabr.ru/post/106302/

What is RANSAC?
- First published by Fischler and Bolles at SRI International in 1981
- From Wiki: an iterative method to estimate parameters of a mathematical model from a set of observed data which contains outliers
- Non-deterministic: outputs a reasonable result with a certain probability

RANdom SAmple Consensus

What is RANSAC?

A data set with many outliers for which a line has to be fitted; the line fitted with RANSAC: the outliers have no influence on the result.
*http://en.wikipedia.org/wiki/RANSAC

RANSAC Input & Output
Input:
- A set of observed data values
- A parameterized model which can explain or be fitted to the observations
- Some confidence parameters:
  - n - the minimum number of data points required to fit the model
  - k - the number of iterations performed by the algorithm
  - t - a threshold value for determining when a datum fits the model
  - d - the number of close data values required to assert that a model fits the data well
Output:
- Best model - the model parameters which best fit the data (or nil if no good model is found)
- Best consensus set - the data points from which this model has been estimated
- Best error - the error of this model relative to the data

Basic Steps
The procedure is iterated k times; in each iteration:
1. Select a random subset of the original data, called the hypothetical inliers.
2. Fit the model's free parameters to the hypothetical inliers, creating a suggested model.
3. Test all remaining points against the suggested model; if a point fits well, also consider it a hypothetical inlier.
4. Check that the suggested model has sufficiently many points classified as hypothetical inliers.
5. Refit the free parameters using the new set of hypothetical inliers.
6. Evaluate the error of the inliers relative to the model.

Basic Steps - Line Fitting Example
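The steps and parameters above can be sketched for line fitting: n = 2 points define a line, and k, t, d play the roles just described. All numeric values here (and the synthetic data) are illustrative.

```python
import numpy as np

def ransac_line(pts, k=100, t=0.5, d=10, rng=None):
    """Fit y = a*x + b by RANSAC: sample 2 points (n = 2 for a line),
    collect points within threshold t, refit on consensus sets of at
    least d points, and keep the fit with the lowest error."""
    rng = rng or np.random.default_rng(0)
    best_err, best_model = np.inf, None
    for _ in range(k):
        i, j = rng.choice(len(pts), 2, replace=False)
        (x1, y1), (x2, y2) = pts[i], pts[j]
        if x1 == x2:
            continue                      # vertical sample: skip
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        resid = np.abs(pts[:, 1] - (a * pts[:, 0] + b))
        inliers = pts[resid < t]          # hypothetical inliers
        if len(inliers) >= d:
            # refit with least squares on the consensus set
            A = np.c_[inliers[:, 0], np.ones(len(inliers))]
            (a, b), *_ = np.linalg.lstsq(A, inliers[:, 1], rcond=None)
            err = np.mean(np.abs(inliers[:, 1] - (a * inliers[:, 0] + b)))
            if err < best_err:
                best_err, best_model = err, (a, b)
    return best_model

# 30 points on y = 2x + 1 plus 10 gross outliers
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 30)
line = np.c_[x, 2 * x + 1]
outliers = rng.uniform(0, 30, (10, 2))
a, b = ransac_line(np.vstack([line, outliers]))
print(a, b)  # recovered slope and intercept, close to 2 and 1
```

A least-squares fit on the same data would be dragged toward the outliers; RANSAC's consensus test is what makes them irrelevant, as in the Wikipedia figure above.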

*copied from Hagit Hel-Or ppt

Basic Steps - Line Fitting Example
Evaluate the error of the inliers relative to the model: C = 3

*copied from Hagit Hel-Or ppt
An example from image mosaicing: estimate the transformation

Back to Image Mosaicing
How is it done? For each pair of images:
1. Extract features
2. Match features
3. Estimate the transformation
4. Transform the 2nd image
5. Blend the two images
Then repeat for the next pair.
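A heavily simplified sketch of the final transform-and-blend steps: a translation-only offset and plain averaging in the overlap stand in for the homography warp and feathered blending a real stitcher would use, and `mosaic_pair` is an illustrative name.

```python
import numpy as np

def mosaic_pair(img1, img2, dx):
    """Paste two horizontally overlapping grayscale images onto one
    canvas, given the estimated offset dx of img2, averaging overlaps."""
    h, w1 = img1.shape
    w2 = img2.shape[1]
    out = np.zeros((h, dx + w2))
    cnt = np.zeros((h, dx + w2))
    out[:, :w1] += img1
    cnt[:, :w1] += 1
    out[:, dx:] += img2
    cnt[:, dx:] += 1
    return out / np.maximum(cnt, 1)   # average where the images overlap

# two constant images overlapping by 2 columns
left = np.full((4, 6), 0.2)
right = np.full((4, 6), 0.4)
pano = mosaic_pair(left, right, dx=4)
print(pano.shape)               # (4, 10)
print(round(pano[0, 4], 2))     # 0.3 (overlap column averages the two)
```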

*Automatic Panoramic Image Stitching using Invariant Features, M. Brown & D.G. Lowe

1. Extract Features
Challenges: we need to match points from different images, across
- Different orientations
- Different scales
- Different illuminations

1. Extract Features
Contenders for the crown:
- SIFT - David Lowe (1999)
- PCA-SIFT - Y. Ke & R. Sukthankar (2004)
- SURF - Bay & Tuytelaars (2006)

1. Extract Features
PCA is used to lower the dimensionality of a dataset with minimal information loss: compute or load a projection matrix using a set of images which match certain characteristics.

*http://homepages.dcc.ufmg.br/~william/papers/paper_2012_CIS.pdf

SIFT or PCA-SIFT

PCA-SIFT: Computing a Projection Matrix
1. Select a representative set of pictures and detect all keypoints in these pictures (~20-40K).
2. For each keypoint:
   - Extract an image patch around it with size 41 x 41 pixels.
   - Calculate horizontal and vertical gradients, resulting in a vector of size 39 x 39 x 2 = 3042.
3. Put all these vectors into a k x 3042 matrix A, where k is the number of keypoints detected.
4. Calculate the covariance matrix of A.
5. Compute the eigenvectors and eigenvalues of cov A.
6. Select the first n eigenvectors; the projection matrix is the n x 3042 matrix composed of these eigenvectors.
   - n can either be a fixed value determined empirically or set dynamically based on the eigenvalues.
7. The projection matrix is only computed once and saved.
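Steps 3-6 might be sketched as below. To keep the demo fast, the patch dimension is reduced from 3042 to 50 and the data is random; the function names are illustrative.

```python
import numpy as np

def compute_projection(A, n):
    """Rows of A are flattened gradient patches (k x 3042 in PCA-SIFT).
    Return the n x dim projection matrix of top eigenvectors of cov(A)."""
    A0 = A - A.mean(axis=0)                  # center the data
    cov = A0.T @ A0 / (len(A0) - 1)          # covariance matrix of A
    vals, vecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    return vecs[:, ::-1][:, :n].T            # top-n eigenvectors as rows

def project_descriptor(grad_vec, P):
    """Multiply one flattened gradient vector by the projection matrix."""
    return P @ grad_vec

# stand-in data: 200 "patches" of dimension 50 instead of 3042
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 50))
P = compute_projection(A, n=8)
print(P.shape)                              # (8, 50)
print(project_descriptor(A[0], P).shape)    # (8,)
```

The same `project_descriptor` multiplication is the per-keypoint step described on the next slide.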

1. Extract Features: PCA-SIFT
- Detect keypoints in the image, the same as SIFT.
- Extract a 41 x 41 patch centered over each keypoint and compute its local image gradients.
- Project the gradient image vector by multiplying it with the projection matrix, to derive a compact feature vector.
- This results in a descriptor of size n.