
Page 1: Lecture 2: Correspondence, Metric Learning

CSE 252C: Advanced Computer Vision

Manmohan Chandraker

CSE 252C, SP20: Manmohan Chandraker

Page 2: Virtual classrooms
• Virtual lectures on Zoom
– Only host shares the screen
– Keep video off and microphone muted
– But please do speak up (remember to unmute!)
• Virtual interactions on Zoom
– Ask and answer plenty of questions
– “Raise hand” feature on Zoom when you wish to speak
– Post questions on chat window
– TA will help keep track of raised hands and chat window
• Lectures recorded and uploaded on Kaltura
– Available under “My Media” on Canvas

Page 3: Enrollment logistics
• If on waitlist, need to send “Request to enroll” email
– State whether you satisfy the listed pre-requisite course requirements
– Explain whether you have required background (courses, projects)
• If not cleared to enroll, or while on the waitlist
– You are welcome to attend lectures
– To limit TA workload, we can grade only enrolled students
• All announcements will be posted on Piazza
– Send email to TA (CC instructor) if you did not get a Piazza invite
• MS Comps: Hosted as the final exam of the course
• 2 units or S-U
– Send instructor an email and CC TAs
– Assignments optional, graded on final exams and presentation

Page 4: Course details
• Class webpage:
– http://cseweb.ucsd.edu/~mkchandraker/classes/CSE252C/Spring2020/
• Instructor email:
– [email protected]
• TAs: Zhengqin Li and You-Yi Jau
– Emails: [email protected] [email protected]
• Grading
– 10% presentation
– 60% assignments
– 30% final exam
• Aim is to learn together, discuss and have fun!

Page 5: Course details
• “Lightning” presentations
– Four students to present in one class
– Time limit: 5 minutes
– Papers to be assigned by instructor
– Order of presentation: alphabetic
– https://docs.google.com/spreadsheets/d/1VcdhEKmvDqLFfYIYh7x6oOAqiVNfdbc6JLD5omBpXIA/edit#gid=0
• Send presentation 1 day before class
– Well-practiced and fluent presentation
– Include narration if asynchronous
– Ask and answer questions after presentation

Page 6: Course details
• Presentation format (1 slide for each):
1. Motivation and problem description
2. Prior work
3. Method overview
4. Method analysis
5. Experiments
6. Future work and discussion

Page 7: Papers for Wed, Apr 15
• SuperPoint: Self-Supervised Interest Point Detection and Description
– https://arxiv.org/abs/1712.07629
• WarpNet: Weakly Supervised Matching for Single-view Reconstruction
– https://arxiv.org/abs/1604.05592
• AnchorNet: A Weakly Supervised Network to Learn Geometry-Sensitive Features for Semantic Matching
– https://arxiv.org/abs/1704.04749
• SIFT Flow: Dense Correspondence across Different Scenes
– http://people.csail.mit.edu/celiu/ECCV2008/

Page 8: Correspondence

Page 9: Why Do We Have Two Eyes?
• Depth information is lost in image formation
• Binocular (stereo) vision enables depth estimation

Page 10: How do we perceive depth in images?
Relate projections of the same point in two or more images of the scene.

Page 11: How do we perceive depth in images?
Relate intensities of image points in two or more images.

Page 12: How do we perceive depth in images?
Use prior information in the form of semantics, function, affordance.

Page 13: Visual correspondence aids 3D reconstruction
A key problem in 3D reconstruction: relate similar elements across a set of images.
[Figure panels: Geometric, Photometric, Semantic]

Page 14: Simple matching methods
Interest point:
• Localized position
• Informative about image content
• Repeatable under variations

Page 15: Simple matching methods
W1(x,y): k × k pixel patch in image 1
W2(x,y): k × k pixel patch in image 2
Descriptor:
• Function applied on W1 and W2, to enable comparing them
Interest point:
• Localized position
• Informative about image content
• Repeatable under variations

Page 16: Simple matching methods
• SSD (Sum of Squared Differences)
W1(x,y): k × k pixel patch in image 1
W2(x,y): k × k pixel patch in image 2

Page 17: Simple matching methods
• SSD (Sum of Squared Differences)
• NCC (Normalized Cross Correlation)
• What advantages might NCC have over SSD?
[Equation images not preserved; labels: (Mean), (Standard deviation)]
W1(x,y): k × k pixel patch in image 1
W2(x,y): k × k pixel patch in image 2
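The SSD and NCC formulas on these slides were images and did not survive extraction. As a minimal sketch, assuming the standard textbook definitions (SSD sums squared pixel differences; NCC subtracts each patch mean and normalizes by the patch standard deviations), the two scores can be compared on a patch and a gain-and-bias-shifted copy of it. The function names `ssd` and `ncc` and the test patches are illustrative, not from the lecture.

```python
import numpy as np

def ssd(w1, w2):
    """Sum of squared differences between two equally sized patches."""
    d = w1.astype(np.float64) - w2.astype(np.float64)
    return float(np.sum(d * d))

def ncc(w1, w2, eps=1e-12):
    """Normalized cross-correlation of two patches.

    Each patch has its mean subtracted; the product is normalized by the
    norms of the centered patches (proportional to their standard deviations).
    """
    a = w1.astype(np.float64) - w1.mean()
    b = w2.astype(np.float64) - w2.mean()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b)) + eps
    return float(np.sum(a * b) / denom)

rng = np.random.default_rng(0)
patch = rng.random((7, 7))        # hypothetical k x k patch, k = 7
brighter = 2.0 * patch + 0.5      # same content under a gain and bias change

print("SSD:", ssd(patch, brighter))   # large, although the content is identical
print("NCC:", ncc(patch, brighter))   # close to 1
```

This suggests one answer to the question on the slide: NCC is unaffected by an affine change in brightness, whereas SSD is not.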

Page 18: Feature distance: threshold
• The distance threshold affects performance
• Only matches with distance less than threshold are allowed
• True positives = number of detected matches that are correct
– Suppose we want to maximize these: how to choose threshold?
• False positives = number of detected matches that are incorrect
– Suppose we want to minimize these: how to choose threshold?
[Figure: candidate matches with feature distances 50, 75 and 200, labeled as true match or false match]
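The trade-off behind the two questions above can be seen by counting true and false positives at several thresholds. This is only an illustrative sketch: the distances and the correct/incorrect labels below are made up (the slide's figure does not say which match is which), and `count_matches` is a hypothetical helper.

```python
import numpy as np

def count_matches(distances, is_correct, threshold):
    """Count true and false positives among matches accepted at a threshold."""
    accepted = distances < threshold
    true_pos = int(np.sum(accepted & is_correct))
    false_pos = int(np.sum(accepted & ~is_correct))
    return true_pos, false_pos

# Hypothetical candidate matches: feature distance and ground-truth correctness.
distances = np.array([50.0, 75.0, 120.0, 200.0])
is_correct = np.array([True, True, False, False])

for threshold in (60.0, 100.0, 250.0):
    tp, fp = count_matches(distances, is_correct, threshold)
    print(f"threshold={threshold:6.1f}  true positives={tp}  false positives={fp}")
```

A high threshold maximizes true positives but admits every false match; a low threshold removes false positives at the cost of dropping correct matches.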

Page 19: Choosing a match: ratio test
• First approach: use SSD(f1, f2)
• Better approach: ratio distance = SSD(f1, f2) / SSD(f1, f2′)
– f2 is best SSD match to f1 in I2
– f2′ is second best SSD match to f1 in I2
– Gives large values (close to 1) for ambiguous matches.
[Figure: feature f1 in image I1; candidate matches f2 and f2′ in image I2]
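A minimal sketch of the ratio test, assuming descriptors are rows of NumPy arrays and that a match is kept when the ratio is below a cutoff. The cutoff `max_ratio=0.8`, the function name and the toy descriptors are assumptions for illustration, not values from the slide.

```python
import numpy as np

def ratio_test_matches(desc1, desc2, max_ratio=0.8):
    """Match each descriptor in desc1 to desc2, keeping only unambiguous matches.

    The ratio distance is SSD(f1, f2) / SSD(f1, f2'), where f2 is the best and
    f2' the second-best SSD match; values near 1 indicate an ambiguous match.
    """
    diff = desc1[:, None, :] - desc2[None, :, :]
    ssd = np.sum(diff * diff, axis=2)           # pairwise SSD, shape (n1, n2)
    order = np.argsort(ssd, axis=1)
    rows = np.arange(len(desc1))
    best = ssd[rows, order[:, 0]]
    second = ssd[rows, order[:, 1]]
    ratio = best / np.maximum(second, 1e-12)    # guard against division by zero
    return [(int(i), int(order[i, 0])) for i in np.flatnonzero(ratio < max_ratio)]

# Toy descriptors: the first query has one clear match, the second has two
# equally good candidates and is rejected as ambiguous.
desc1 = np.array([[0.0, 0.0], [5.0, 5.0]])
desc2 = np.array([[0.1, 0.0], [10.0, 10.0], [5.0, 4.0], [5.0, 6.0]])
print(ratio_test_matches(desc1, desc2))
```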

Page 20: SIFT

Page 21: Desirable property: invariance
Find features that are invariant to transformations
– geometric invariance: translation, rotation, scale
– photometric invariance: brightness, exposure, …

Page 22: Idea of SIFT
• For better image matching, need to develop an interest operator invariant to scale and rotation.
• Also, need a descriptor robust to typical variations. The descriptor is the most-used part of SIFT.
[Adapted from: D. Lowe]

Page 23: Overall Procedure at a High Level
1. Scale-space extrema detection: Search over multiple scales and image locations.
2. Keypoint localization: Fit a model to determine location and scale. Select keypoints based on a measure of stability.
3. Orientation assignment: Compute best orientation(s) for each keypoint region.
4. Keypoint description: Use local image gradients at selected scale and rotation to describe each keypoint region.
[Adapted from: D. Lowe]

Page 24: 1. Scale-space extrema detection
• Goal: Identify locations and scales that can be reliably assigned under different views of the same scene or object.
• Method: search for stable features across multiple scales using a continuous function of scale.
• The scale space of an image is a function $L(x, y, \sigma)$ that is produced from the convolution of a Gaussian kernel (at different scales) with the input image.
[Adapted from: D. Lowe]

Page 25: Aside: Gaussian Smoothing
[Figures: a 1D Gaussian with mean 0, variance 1; a 2D Gaussian with mean (0,0), variance 1]

Page 26: Example: Gaussian Smoothed Scale Space
• Scale space axioms: linearity, shift invariant, rotation invariant, no spurious extrema
• Gaussian filter uniquely satisfies axioms

Page 27: Aside: Image Pyramids
• Bottom level is the original image.
• 2nd level is derived from the original image according to some function.
• 3rd level is derived from the 2nd level according to the same function.
• And so on.
[Adapted from: D. Lowe]

Page 28: Aside: Gaussian Pyramid
At each level, image is smoothed and reduced in size.
• Bottom level is the original image.
• At 2nd level, each pixel is the result of applying a Gaussian mask to the first level and then subsampling to reduce the size.
• And so on.
[Figure label: Apply Gaussian filter]
[Adapted from: D. Lowe]
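The smooth-then-subsample construction above can be sketched with NumPy alone. This is an illustrative sketch, not the lecture's implementation: the kernel radius of 3σ, edge padding, σ = 1.0 and a subsampling factor of 2 are all assumptions, and the helper names are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian blur with edge padding; output has the input shape."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(image.astype(np.float64), r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def gaussian_pyramid(image, levels, sigma=1.0):
    """Level 0 is the original image; each later level is blurred, then subsampled by 2."""
    pyramid = [image.astype(np.float64)]
    for _ in range(levels - 1):
        pyramid.append(gaussian_blur(pyramid[-1], sigma)[::2, ::2])
    return pyramid

image = np.random.default_rng(0).random((16, 16))   # stand-in for a real image
print([level.shape for level in gaussian_pyramid(image, levels=3)])
```

Blurring before subsampling is what distinguishes this from simply dropping every other pixel, which would alias high-frequency content.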

Page 29: Example: Subsampling with Gaussian Pre-filtering
[Adapted from: D. Lowe]

Page 30: Scale Space Pyramid
• $s+2$ filters: $\sigma_0$, $\sigma_1 = 2^{1/s}\sigma_0$, $\sigma_2 = 2^{2/s}\sigma_0$, …, $\sigma_i = 2^{i/s}\sigma_0$, …, $\sigma_{s+1} = 2^{(s+1)/s}\sigma_0$
• $s+3$ images including original
• $s+2$ difference images
The parameter $s$ determines the number of images per octave.
[Adapted from: D. Lowe]
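The bookkeeping on this slide can be checked with a few lines of arithmetic. A small sketch, assuming the filter scales follow the rule $\sigma_i = 2^{i/s}\sigma_0$ read off the slide; the base scale `sigma0=1.6` and the helper name `octave_sigmas` are assumptions for illustration.

```python
def octave_sigmas(s, sigma0=1.6):
    """Filter scales sigma_i = 2**(i/s) * sigma0 for i = 0 .. s+1 (s+2 filters)."""
    return [sigma0 * 2.0 ** (i / s) for i in range(s + 2)]

s = 3
sigmas = octave_sigmas(s)
num_images = len(sigmas) + 1        # s+3 images, counting the original
num_differences = num_images - 1    # s+2 differences of adjacent images

print([round(v, 3) for v in sigmas])
print(num_images, num_differences)
```

Note that the filter at index $i = s$ has exactly twice the base scale, which is what makes one octave.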

Page 31: 2. Key point localization
• Detect maxima and minima of difference-of-Gaussian in scale space
• Each point is compared to its 8 neighbors in the current image and 9 neighbors each in the scales above and below
• For each max or min found, output is the location and the scale.
• s+2 difference images; top and bottom ignored; s planes searched.
[Adapted from: D. Lowe]
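The 26-neighbor comparison described above can be sketched as a brute-force scan over a stack of difference-of-Gaussian images. This is an unoptimized illustration under the assumption that the stack is a NumPy array indexed as `dog[scale, y, x]`; the function name is hypothetical and image borders are simply skipped.

```python
import numpy as np

def local_extrema(dog):
    """Return (scale, y, x) of points strictly above or below all 26 neighbors.

    The first and last scale planes are skipped because they lack a neighbor
    plane on one side, so a stack of s+2 planes yields s searched planes.
    """
    found = []
    n_scales, height, width = dog.shape
    for s in range(1, n_scales - 1):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                center = dog[s, y, x]
                neighbors = np.delete(cube.ravel(), 13)   # index 13 is the center
                if center > neighbors.max() or center < neighbors.min():
                    found.append((s, y, x))
    return found

dog = np.zeros((3, 5, 5))   # toy stack: 3 difference images of size 5 x 5
dog[1, 2, 2] = 1.0          # one isolated peak in the middle plane
print(local_extrema(dog))
```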

Page 32: Difference of Gaussians
• Scale-space detection
– Find local maxima across scale/space
– A good “blob” detector
[T. Lindeberg, IJCV 1998]

Page 33: 3. Orientation assignment
• Create histogram of local gradient directions at selected scale
• Assign canonical orientation at peak of smoothed histogram
• If 2 major orientations, use both.
[Adapted from: D. Lowe]
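A simplified sketch of orientation assignment: build a magnitude-weighted histogram of gradient directions over a patch, smooth it circularly, and report every major peak. The choices of 36 bins, a [0.25, 0.5, 0.25] smoothing kernel and a peak ratio of 0.8 are assumptions for illustration, and the Gaussian spatial weighting used in full SIFT implementations is omitted.

```python
import numpy as np

def dominant_orientations(patch, num_bins=36, peak_ratio=0.8):
    """Return bin-start angles (degrees) of major peaks in the smoothed histogram."""
    gy, gx = np.gradient(patch.astype(np.float64))    # derivatives along y, then x
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    bin_width = 360.0 / num_bins
    bins = np.floor(angle / bin_width).astype(int) % num_bins
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=num_bins)
    # Circular smoothing, so that bin 0 and the last bin are treated as neighbors.
    smooth = 0.25 * np.roll(hist, 1) + 0.5 * hist + 0.25 * np.roll(hist, -1)
    is_local_max = (smooth > np.roll(smooth, 1)) & (smooth > np.roll(smooth, -1))
    is_major = smooth >= peak_ratio * smooth.max()
    return [float(b * bin_width) for b in np.flatnonzero(is_local_max & is_major)]

ramp = np.tile(np.arange(9, dtype=np.float64), (9, 1))   # intensity increases along x
print(dominant_orientations(ramp))
```

Returning a list rather than a single angle reflects the last bullet on the slide: when two orientations are both strong, a keypoint is created for each.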

Page 34: Keypoint localization with orientation
[Figure: 233 x 189 image; 832 keypoints initially, 729 keypoints after gradient threshold, 536 keypoints after ratio threshold]

Page 35: 4. Keypoint Descriptors
• At this point, each keypoint has
– location
– scale
– orientation
• Next is to compute a descriptor for the local image region about each keypoint that is
– highly distinctive
– as invariant as possible to variations such as changes in viewpoint and illumination
[Adapted from: D. Lowe]

Page 36: Normalization
• Rotate the window to standard orientation
• Scale the window size based on the scale at which the point was found.

Page 37: SIFT Keypoint Descriptor (shown with 2 x 2 descriptors)
In implementation, 4 x 4 arrays of 8-bin histograms are used, a total of 128 features for one keypoint.

Page 38: SIFT for Correspondence
[Code and tutorial: https://www.vlfeat.org/overview/sift.html]

Page 39: SIFT for Correspondence

Page 40: SIFT for Correspondence

Page 41: Uses for SIFT
• Feature points are used also for:
– Panorama stitching
– 3D reconstruction
– Motion tracking
– Object recognition
– Indexing and database retrieval
– Robot navigation
– … many others

Page 42: Cases where SIFT does not work
– Strong illumination changes
– Large out-of-plane rotations
– Non-rigid deformations or articulations
– Semantic correspondence

Page 43: Learning Correspondence

Page 44: Measuring patch similarity
Similar?
[Image courtesy of Vedaldi et al., VLFeat]

Page 45: Measuring patch similarity
• Hand Designed Features
- SIFT, SURF, ORB, FREAK, DAISY
[Lowe et al., ICCV 1999; Bay et al., ECCV 2006; Rublee et al., ICCV 2011; Alahi et al., ICCV 2012; Tola et al., CVPR 2008]
• Neural Networks
- Compare Patches [Zagoruyko et al.]
- MatchNet [Han et al.]
- See by Moving [Agrawal et al.]
- Stereo Matching Cost [Zbontar et al.]
[Agrawal et al., ICCV 2015; Han et al., CVPR 2015; Zagoruyko et al., CVPR 2015; Zbontar et al., CVPR 2015]
• Global Optimization
- Flow, DSP, etc.
[Revaud et al., ICCV 2013; Liu et al., PAMI 2011]
[Diagram: two CNN branches feeding FC layers that output a similarity score]

Page 46: Spatial localization in classification CNNs
[Long et al., “Do Convnets Learn Correspondence?”, NeurIPS 2014]

Page 47: Spatial localization in classification CNNs

Page 48: Spatial localization in classification CNNs

Page 49: Spatial localization in classification CNNs

Page 50: CNNs for learning correspondence
Idea:
• Siamese network to decide patch similarity
• Use intermediate activations as features.
Advantages:
• Easy to train
Issues:
• Inefficient: extract features for overlapping regions within patches
• O(n^2) feed-forward passes to compare n patches in each image.
[Zagoruyko and Komodakis, CVPR 2015]

Page 51: Need for a metric in correspondence learning
- Learn a feature space directly optimized for correspondence
- Intermediate activations of patch similarity are surrogate features
- Mapping an image to a metric space
- Metric Space: Distance relationship = Class membership

Page 52: Metric Learning

Page 53: Metric
• A metric quantifies distance between any two members of a set
• Induces a similarity measure

Page 54: Metric
• Useful for, say, k-nearest neighbors classification
• For every training sample, want k closest samples to be from the same class

Page 55: Nearest neighbors
• Useful for, say, k-nearest neighbors classification
• For every training sample, want k closest samples to be from the same class
• Nearest neighbor classification:
– Assign label of nearest training data point to each test data point
[Figure from Duda et al.: Voronoi partitioning of feature space for 2-category 2D data. Black = negative, Red = positive. The novel test example is closest to a positive example from the training set, so classify it as positive.]

Page 56: Nearest neighbors
• Useful for, say, k-nearest neighbors classification
• For every training sample, want k closest samples to be from the same class
• Nearest neighbor classification:
– Assign label of nearest training data point to each test data point
• Extend to k-nearest neighbor classification:
– For a new point, find the k closest points from training data
– Labels of the k points “vote” to classify
[Figure from D. Lowe: k = 5. Black = negative, Red = positive. If the query lands here, the 5 NN consist of 3 negatives and 2 positives, so we classify as negative.]

Page 57: Nearest neighbors
• Useful for, say, k-nearest neighbors classification
• For every training sample, want k closest samples to be from the same class
• Nearest neighbor classification:
– Assign label of nearest training data point to each test data point
• Extend to k-nearest neighbor classification:
– For a new point, find the k closest points from training data
– Labels of the k points “vote” to classify
• Advantages:
– Simple to implement
– Flexible to feature or distance choices
– Can do well in practice with enough representative data

Page 58: Nearest neighbors
• Useful for, say, k-nearest neighbors classification
• For every training sample, want k closest samples to be from the same class
• Nearest neighbor classification:
– Assign label of nearest training data point to each test data point
• Extend to k-nearest neighbor classification:
– For a new point, find the k closest points from training data
– Labels of the k points “vote” to classify
• Advantages:
– Simple to implement
– Flexible to feature or distance choices
– Can do well in practice with enough representative data
• Limitations:
– Large search problem to find nearest neighbors
– Storage of data
– Must know we have a meaningful distance function
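A minimal sketch of k-nearest neighbor voting, assuming Euclidean distance and a brute-force search over all training points. The toy data are made up so that the 5 nearest neighbors of the query are 3 negatives and 2 positives, echoing the figure on the k = 5 slide; `knn_classify` is a hypothetical helper name.

```python
from collections import Counter

import numpy as np

def knn_classify(train_x, train_y, query, k=5):
    """Label a query by majority vote among its k nearest training points."""
    dists = np.linalg.norm(train_x - query, axis=1)   # brute-force distances
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[nearest].tolist())
    return votes.most_common(1)[0][0]

train_x = np.array([
    [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0],    # three negatives near the origin
    [0.5, 0.5], [0.0, -0.9],                # two positives near the origin
    [5.0, 5.0], [6.0, 5.0],                 # two positives far away
])
train_y = np.array(["negative"] * 3 + ["positive"] * 4)
query = np.array([0.0, 0.0])

print(knn_classify(train_x, train_y, query, k=1))   # nearest single point decides
print(knn_classify(train_x, train_y, query, k=5))   # five nearest points vote
```

The brute-force distance computation also makes the listed limitations concrete: every query touches all stored training points.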

Page 59: Metric
• A metric quantifies distance between any two members of a set
• Induces a similarity measure
• Properties of a metric
– Non-negativity: $d(x, y) \ge 0$
– Identity: $d(x, y) = 0$ if and only if $x = y$
– Symmetry: $d(x, y) = d(y, x)$
– Triangle inequality: $d(x, y) + d(y, z) \ge d(x, z)$
• Example of metric: $d(x, y) = \lVert x - y \rVert$
• Example of non-metric: $d(x, y) = x^2 y^2$
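The four properties can be spot-checked numerically on a handful of sample points. This is a sketch only: it works on scalars, it reads the slide's non-metric example as $d(x, y) = x^2 y^2$ (an assumption, since the formula was garbled in extraction), and passing the check on a finite sample shows that no violation was found, not that the function is a metric.

```python
import itertools

def first_violation(d, points, tol=1e-9):
    """Return the name of the first metric property violated on the sample, or None."""
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < -tol:
            return "non-negativity"
        if (abs(d(x, y)) <= tol) != (x == y):
            return "identity"
        if abs(d(x, y) - d(y, x)) > tol:
            return "symmetry"
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) + d(y, z) < d(x, z) - tol:
            return "triangle inequality"
    return None

def euclidean(x, y):
    return abs(x - y)

def product_of_squares(x, y):
    return (x ** 2) * (y ** 2)

points = [0.0, 1.0, 2.0, -3.0]
print(first_violation(euclidean, points))            # no violation found
print(first_violation(product_of_squares, points))   # fails the identity property
```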

Page 60: Metric
• A metric quantifies distance between any two members of a set
• Induces a similarity measure
• Properties of a metric
– Non-negativity: $d(x, y) \ge 0$
– Identity: $d(x, y) = 0$ if and only if $x = y$
– Symmetry: $d(x, y) = d(y, x)$
– Triangle inequality: $d(x, y) + d(y, z) \ge d(x, z)$
• Sometimes, the choice of metric is trivial
– Depth estimation: distances in 3D, so Euclidean distance is a good metric

Page 61: Metric
• A metric quantifies distance between any two members of a set
• Induces a similarity measure
• Properties of a metric
– Non-negativity: $d(x, y) \ge 0$
– Identity: $d(x, y) = 0$ if and only if $x = y$
– Symmetry: $d(x, y) = d(y, x)$
– Triangle inequality: $d(x, y) + d(y, z) \ge d(x, z)$
• Sometimes, the choice of metric is trivial
• Other times, it is not immediately clear how to define a metric
– Find the two faces that are closest to each other

Page 62: Metric: distances in feature space
• Training data composed of (image, label) pairs
• (Image 1, dog), (Image 2, cat), (Image 3, dog), ....
• Classify test image as dog or cat

Page 63: Metric: distances in feature space
• Represent images as features, can find distance between features
[Figure: training images plotted in a 2D feature space; axes: Feature 1, Feature 2]

Page 64: Metric: distances in feature space
• Represent images as features, can find distance between features
• Map new image to feature space, find distances
[Figure: new image (marked ??) placed in the feature space; axes: Feature 1, Feature 2]

Page 65: Metric: distances in feature space
• Represent images as features, can find distance between features
• Map new image to feature space, find distances
[Figure: new image (marked ??) placed in the feature space; axes: Feature 1, Feature 2]

Page 66: Metric: distances in feature space
• Represent images as features, can find distance between features
• Map new image to feature space, find distances
• Not all dimensions are equally useful!
[Figure: feature space with one axis marked Useful and the other Not useful; new images marked ??]

Page 67: Metric: distances in feature space
• Represent images as features, can find distance between features
• Map new image to feature space, find distances
• Not all dimensions are equally useful!
• Idea: change how we measure distance
[Figure: feature space with one axis marked Useful and the other Not useful; new image marked ??]

Page 68: CSE 252C: Advanced Computer Visioncseweb.ucsd.edu/~mkchandraker/classes/CSE252C/... · • Presentation format (1 slide for each): 1. Motivation and problem description 2. Prior work

Metriclearning

• Givendata,learnametricM thathelpspredictionusingadistancefunction

• Goal: FindametricM suchthat

• Samplesfromsameclass(positives)havelow01• Samplesfromdifferentclasses(negatives,impostors)havehigh01

• Approach: Poseasanoptimizationproblem

??

CSE252C,SP20:ManmohanChandraker

Page 69: CSE 252C: Advanced Computer Visioncseweb.ucsd.edu/~mkchandraker/classes/CSE252C/... · • Presentation format (1 slide for each): 1. Motivation and problem description 2. Prior work

Metriclearning

• Givendata,learnametricM thathelpspredictionusingadistancefunction

• Createtwosets,similarpairsS anddissimilarpairsD

CSE252C,SP20:ManmohanChandraker

Page 70: CSE 252C: Advanced Computer Visioncseweb.ucsd.edu/~mkchandraker/classes/CSE252C/... · • Presentation format (1 slide for each): 1. Motivation and problem description 2. Prior work

Metriclearning

• Givendata,learnametricM thathelpspredictionusingadistancefunction

• Createtwosets,similarpairsS anddissimilarpairsD• FindM suchthat

CSE252C,SP20:ManmohanChandraker

Page 71: CSE 252C: Advanced Computer Visioncseweb.ucsd.edu/~mkchandraker/classes/CSE252C/... · • Presentation format (1 slide for each): 1. Motivation and problem description 2. Prior work

Metriclearning

• Givendata,learnametricM thathelpspredictionusingadistancefunction

• Createtwosets,similarpairsS anddissimilarpairsD• FindM suchthat

• MinimizeanobjectivefunctionwithrespecttoM:

CSE252C,SP20:ManmohanChandraker

Page 72: CSE 252C: Advanced Computer Visioncseweb.ucsd.edu/~mkchandraker/classes/CSE252C/... · • Presentation format (1 slide for each): 1. Motivation and problem description 2. Prior work

Metriclearning

• Givendata,learnametricM thathelpspredictionusingadistancefunction

• Createtwosets,similarpairsS anddissimilarpairsD• FindM suchthat

• MinimizeanobjectivefunctionwithrespecttoM:

• Anoptimizationproblem:findstationarypoints!


• Advanced variant:
• Convex optimization (semidefinite program), efficiently solvable!
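The formula for the advanced variant is missing from this transcript. As an illustration only, one well-known convex formulation with a positive semidefinite constraint on M is that of Xing et al. (2002). Whether it is the variant shown on the slide is an assumption:

```latex
\begin{aligned}
\min_{M} \quad & \sum_{(i,j) \in S} \|x_i - x_j\|_M^2 \\
\text{s.t.} \quad & \sum_{(i,j) \in D} \|x_i - x_j\|_M \ge 1, \\
& M \succeq 0,
\end{aligned}
\qquad \text{where } \|x_i - x_j\|_M = \sqrt{(x_i - x_j)^\top M \, (x_i - x_j)}.
```

The objective is linear in M, the constraint set is convex, and M \succeq 0 keeps the learned distance a valid (pseudo-)metric.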


Large margin nearest neighbors

• Target neighbors of point x_i: points x_j with similar labels
• Impostors are points x_l close to x_i, but with different labels
• Goal: pull targets closer, push impostors farther away
• Given point i, similar points j, impostor points l:
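The loss for this slide is not preserved in the transcript. A minimal sketch of the pull/push loss in the form published by Weinberger and Saul, assuming a unit margin and a trade-off weight `c`. The function names and the `targets` dictionary layout are illustrative choices:

```python
def sq_dist(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[r] * M[r][c] * d[c] for r in range(n) for c in range(n))

def lmnn_loss(X, labels, targets, M, c=1.0):
    """targets[i] lists the target-neighbor indices j of point i.
    pull: squared distances to target neighbors.
    push: hinge on impostors l that come within a unit margin of a target."""
    pull, push = 0.0, 0.0
    for i, js in targets.items():
        for j in js:
            d_ij = sq_dist(X[i], X[j], M)
            pull += d_ij
            for l in range(len(X)):
                if labels[l] != labels[i]:
                    push += max(0.0, 1.0 + d_ij - sq_dist(X[i], X[l], M))
    return pull + c * push

X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.2]]
labels = [0, 0, 1]
targets = {0: [1]}                      # point 1 is the target neighbor of point 0
euclidean = [[1.0, 0.0], [0.0, 1.0]]
learned = [[0.25, 0.0], [0.0, 4.0]]     # hypothetical metric for the toy data

print(lmnn_loss(X, labels, targets, euclidean))
print(lmnn_loss(X, labels, targets, learned))
```

Under the Euclidean metric the point with a different label violates the margin and contributes to the push term. Under the second metric the target is pulled in and the impostor is pushed beyond the margin, so only the small pull term remains.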


Large margin nearest neighbors

Local constraints: directly improve neighbor quality


Large margin nearest neighbors

[Figure: two test images, each shown with its large margin nearest neighbor and its Euclidean distance nearest neighbor. Weinberger et al., 2009]

General recipe for metric learning

LDA as metric learning

Siamese CNNs for distance metric learning

• Advantage of LDA: linear problem!
• Advantage of LMNN: convex problem!
• Disadvantage of both: the feature representation is learned independently of the metric
• Advantage of CNNs: can optimize features conditioned on the similarity measure


Siamese CNN for patch similarity

$\|D(x_1) - D(x_2)\|_2$

Simo-Serra, E., Trulls, E., Ferraz, L., Kokkinos, I., Fua, P. and Moreno-Noguer, F., 2015. Discriminative learning of deep convolutional feature point descriptors. In Proceedings of the IEEE International Conference on Computer Vision (pp. 118-126).

Issue with using this loss for training?


Loss function for Siamese CNNs

The final loss is defined as:

L = ∑ loss of positive pairs + ∑ loss of negative pairs


Loss function for Siamese CNNs

Bell, S. and Bala, K., 2015. Learning visual similarity for product design with convolutional neural networks. ACM Transactions on Graphics (TOG), 34(4), p. 98.

We can use different loss functions for the two types of input pairs.

• Typical positive pair $(x_p, x_q)$ loss: $L(x_p, x_q) = \|x_p - x_q\|^2$ (Euclidean loss)


• Typical negative pair $(x_n, x_q)$ loss: $L(x_n, x_q) = \max(0, m^2 - \|x_n - x_q\|^2)$ (hinge loss)


Loss function for Siamese CNNs

• Combined into a contrastive loss
• For a pair of training examples $x_1$ and $x_2$ (with labels $y_1$, $y_2$):

$L(x_1, x_2) = s_{12}\,\|x_1 - x_2\|^2 + (1 - s_{12})\,\max(0, m^2 - \|x_1 - x_2\|^2)$,

where $s_{12} = 1$ when $y_1 = y_2$, and $s_{12} = 0$ otherwise.
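The contrastive loss above can be sketched directly on embedding vectors. This is a minimal illustration, assuming the inputs are the network outputs for the two patches. The function names and the default margin value are illustrative:

```python
def sq_norm_diff(a, b):
    """Squared Euclidean distance ||a - b||^2."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def contrastive_loss(x1, x2, y1, y2, m=1.0):
    """s12 * ||x1 - x2||^2 + (1 - s12) * max(0, m^2 - ||x1 - x2||^2),
    with s12 = 1 when the labels match and 0 otherwise."""
    s12 = 1.0 if y1 == y2 else 0.0
    d2 = sq_norm_diff(x1, x2)
    return s12 * d2 + (1.0 - s12) * max(0.0, m * m - d2)

# Positive pair: the loss is the squared distance, so it is pulled together.
print(contrastive_loss([0.0, 0.0], [0.3, 0.4], y1=1, y2=1))
# Negative pair inside the margin: penalized until it is at least m apart.
print(contrastive_loss([0.0, 0.0], [0.3, 0.4], y1=1, y2=2))
# Negative pair already beyond the margin: contributes no loss.
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], y1=1, y2=2))
```

The hinge term is what prevents the trivial solution of mapping every input to the same point, which would minimize the positive-pair term alone.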


Triplet loss for metric learning

• At each training iteration, sample a mini-batch of triplets (anchor, positive, negative)
• Goal: push the negative a margin further from the anchor than the positive
• Triplet loss:


• Gradients:
• Issues:
  – Fixed parameter m, while intra-class distances can vary
  – Inefficient to consider all triplets
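The triplet loss and gradient formulas are not preserved in this transcript. A minimal sketch, assuming the common squared-distance hinge form L = max(0, ||a - p||^2 - ||a - n||^2 + m) on embeddings a (anchor), p (positive), n (negative). The function name and the default margin are illustrative:

```python
def triplet_loss_and_grads(a, p, n, m=0.2):
    """Returns (loss, dL/da, dL/dp, dL/dn) for one triplet of embeddings."""
    d_ap = sum((u - v) ** 2 for u, v in zip(a, p))
    d_an = sum((u - v) ** 2 for u, v in zip(a, n))
    loss = max(0.0, d_ap - d_an + m)
    if loss == 0.0:
        # Margin already satisfied: the hinge is inactive, gradients vanish.
        zeros = [0.0] * len(a)
        return loss, zeros, zeros[:], zeros[:]
    grad_a = [2.0 * (ni - pi) for pi, ni in zip(p, n)]   # 2(a - p) - 2(a - n)
    grad_p = [2.0 * (pi - ai) for ai, pi in zip(a, p)]   # -2(a - p)
    grad_n = [2.0 * (ai - ni) for ai, ni in zip(a, n)]   # 2(a - n)
    return loss, grad_a, grad_p, grad_n

# Active triplet: the negative is as close to the anchor as the positive.
print(triplet_loss_and_grads([0.0, 0.0], [1.0, 0.0], [0.0, 1.0]))
# Easy triplet: the negative is already far away, so loss and gradients are zero.
print(triplet_loss_and_grads([0.0, 0.0], [1.0, 0.0], [0.0, 3.0]))
```

The second call illustrates why considering all triplets is inefficient: triplets that already satisfy the margin produce zero gradient and do not contribute to training.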

Alternatives to Siamese networks