CS 496: Computer Vision
Thanks to Chris Bregler
CS 496: Computer Vision
• Personnel
  – Instructor: Szymon Rusinkiewicz
    [email protected]
  – TA: Wagner Corrêa
    [email protected]
  – Email to both: [email protected]
• Course web page
  http://www.cs.princeton.edu/courses/cs496/
What is Computer Vision?
• Input: images or video
• Output: description of the world
  – Many levels of description
Low-Level or “Early” Vision
• Considers local properties of an image
“There’s an edge!”
Mid-Level Vision
• Grouping and segmentation
“There’s an object and a background!”
High-Level Vision
• Recognition
“It’s a chair!”
Big Question #1: Who Cares?
• Applications of computer vision
  – In AI: vision serves as the “input stage”
  – In medicine: understanding human vision
  – In engineering: model extraction
Vision and Other Fields
[Diagram: Computer Vision and related fields: Artificial Intelligence, Cognitive Psychology, Signal Processing, Computer Graphics, Pattern Analysis, Metrology]
Big Question #2: Does It Work?
• Situation much the same as AI:
  – Some fundamental algorithms
  – Large collection of hacks / heuristics
• Vision is hard!
  – Especially at high level, physiology unknown
  – Requires integrating many different methods
  – Requires reasoning and understanding: “AI completeness”
Computer and Human Vision
• Emulating effects of human vision
• Understanding physiology of human vision
Image Formation
• Human: lens forms image on retina, sensors (rods and cones) respond to light
• Computer: lens system forms image, sensors (CCD, CMOS) respond to light
Low-Level Vision
Hubel
Low-Level Vision
• Retinal ganglion cells
• Lateral Geniculate Nucleus – function unknown (visual adaptation?)
• Primary Visual Cortex
  – Simple cells: orientational sensitivity
  – Complex cells: directional sensitivity
• Further processing
  – Temporal cortex: what is the object?
  – Parietal cortex: where is the object? How do I get it?
Low-Level Vision
• Net effect: low-level human vision can be (partially) modeled as a set of multiresolution, oriented filters
Low-Level Depth Cues
• Focus
• Vergence
• Stereo
• Not as important as popularly believed
Low-Level Computer Vision
• Filters and filter banks
  – Implemented via convolution
  – Detection of edges, corners, and other local features
  – Can include multiple orientations
  – Can include multiple scales: “filter pyramids”
• Applications
  – First stage of segmentation
  – Texture recognition / classification
  – Texture synthesis
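To make "implemented via convolution" concrete, here is a minimal sketch of a two-orientation filter bank applied to a tiny synthetic image. The Sobel kernels, the helper name `convolve2d`, and the test image are illustrative assumptions, not material from the slides; real systems would use an optimized library routine.

```python
import math

def convolve2d(image, kernel):
    """'Valid' 2D convolution of a list-of-lists image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Flip the kernel in both axes; without the flip this would be correlation.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            acc = 0
            for j in range(kh):
                for i in range(kw):
                    acc += image[y + j][x + i] * flipped[j][i]
            row.append(acc)
        out.append(row)
    return out

# Two orientations of the same derivative filter (assumed choice: Sobel).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges

# Synthetic 5x6 image: dark left half, bright right half (one vertical edge).
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]

gx = convolve2d(image, SOBEL_X)
gy = convolve2d(image, SOBEL_Y)

# Edge strength: gradient magnitude at every output pixel.
magnitude = [[math.hypot(a, b) for a, b in zip(rx, ry)] for rx, ry in zip(gx, gy)]
```

Only the columns that straddle the dark-to-bright transition produce a nonzero response in `gx`, and `gy` stays zero because the image has no horizontal edge. A multi-scale "filter pyramid" would repeat the same filtering on progressively downsampled copies of the image.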
Texture Analysis / Synthesis
[Figure labels: Original Image; Multiresolution Oriented Filter Bank; Image Pyramid]
Texture Analysis / Synthesis
[Figure labels: Original Texture; Synthesized Texture]
Heeger and Bergen
Low-Level Computer Vision
• Optical flow
  – Detecting frame-to-frame motion
  – Local operator: looking for gradients
• Applications
  – First stage of tracking
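One common way to turn "a local operator looking for gradients" into code is a Lucas-Kanade style least-squares fit of the brightness-constancy constraint Ix·u + Iy·v + It = 0 over a small window. The sketch below is an assumed illustration, not the method from the slides: the function names, the quadratic test pattern, and the choice to average spatial gradients over both frames are all hypothetical.

```python
def lucas_kanade_window(frame1, frame2):
    """Estimate a single flow vector (u, v) for a small window.

    Uses interior pixels only, central differences for the spatial
    gradients (averaged over both frames), and a frame difference
    for the temporal gradient.
    """
    h, w = len(frame1), len(frame1[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix = 0.25 * ((frame1[y][x + 1] - frame1[y][x - 1])
                         + (frame2[y][x + 1] - frame2[y][x - 1]))
            iy = 0.25 * ((frame1[y + 1][x] - frame1[y - 1][x])
                         + (frame2[y + 1][x] - frame2[y - 1][x]))
            it = frame2[y][x] - frame1[y][x]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    # Normal equations: [[sxx, sxy], [sxy, syy]] [u, v]^T = -[sxt, syt]^T
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("gradients too uniform in this window (aperture problem)")
    u = (-sxt * syy + syt * sxy) / det
    v = (-syt * sxx + sxt * sxy) / det
    return u, v

def make_frame(dx, dy, size=7):
    """Hypothetical test pattern: a quadratic bowl shifted by (dx, dy)."""
    c = size // 2
    return [[(x - c - dx) ** 2 + (y - c - dy) ** 2 for x in range(size)]
            for y in range(size)]

frame1 = make_frame(0, 0)
frame2 = make_frame(1, 0)          # pattern moved one pixel to the right
u, v = lucas_kanade_window(frame1, frame2)
```

A dense optical flow field is obtained by repeating this estimate in a window around every pixel. Windows whose gradients all point the same way make the 2x2 system singular, which is why the sketch raises an error in that case.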
Optical Flow
[Figure labels: Image #1; Image #2; Optical Flow Field]
Low-Level Computer Vision
• Shape from X
  – Stereo
  – Motion
  – Shading
  – Texture foreshortening
3D Reconstruction
Tomasi+Kanade
Debevec, Taylor, Malik
Pighin et al.
Forsyth et al.
Mid-Level Vision
• Physiology unclear
• Observations by Gestalt psychologists
  – Proximity
  – Similarity
  – Common fate
  – Common region
  – Parallelism
  – Closure
  – Symmetry
  – Continuity
  – Familiar configuration
Wertheimer
Grouping Cues
Mid-Level Computer Vision
• Techniques
  – Clustering based on similarity
  – Limited work on other principles
• Applications
  – Segmentation / grouping
  – Tracking
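As one possible instance of "clustering based on similarity", the sketch below runs k-means on pixel intensities to split a tiny image into dark and bright segments. K-means is an assumed choice here (the slides do not name an algorithm), and the function name `kmeans_1d`, the pixel values, and the initialization are made up for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on scalar values; returns final centers and per-value labels."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: each value joins the cluster with the nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            clusters[nearest].append(v)
        # Update step: each center moves to the mean of its members
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[k]
                   for k, c in enumerate(clusters)]
    labels = [min(range(len(centers)), key=lambda k: abs(v - centers[k]))
              for v in values]
    return centers, labels

# Hypothetical grayscale pixels: four dark, four bright.
pixels = [12, 15, 10, 200, 210, 205, 14, 198]
centers, labels = kmeans_1d(pixels, centers=[min(pixels), max(pixels)])
```

The labels form a two-region segmentation of the pixels. For color segmentation the same loop would operate on (R, G, B) triples with a Euclidean distance in place of the absolute difference.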
Snakes: Active Contours
Contour Evolution for Segmenting an Artery
Birchfield
Histograms
Expectation Maximization (EM)
Color Segmentation
Bayesian Methods
• Prior probability
  – Expected distribution of models
• Conditional probability P(A|B)
  – Probability of observation A given model B
• Bayes’s Rule: P(B|A) = P(A|B) P(B) / P(A)
  – Probability of model B given observation A
Thomas Bayes (c. 1702-1761)
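A small worked example of Bayes's rule may help: given priors over two candidate models and the likelihood of one observation under each, compute the posterior over models. The model names "a" and "b" and all the probabilities below are made-up numbers for illustration only.

```python
def posterior(priors, likelihoods):
    """Apply Bayes's rule, P(B|A) = P(A|B) P(B) / P(A), to every model B.

    priors[m]      = P(B = m), the expected distribution of models
    likelihoods[m] = P(A | B = m), the probability of the observation under m
    """
    # P(A): total probability of the observation, summed over all models.
    evidence = sum(likelihoods[m] * priors[m] for m in priors)
    return {m: likelihoods[m] * priors[m] / evidence for m in priors}

# Hypothetical numbers: both models equally likely beforehand, but the
# observation is three times more probable under model "b".
priors = {"a": 0.5, "b": 0.5}
likelihoods = {"a": 0.2, "b": 0.6}

post = posterior(priors, likelihoods)
best = max(post, key=post.get)   # maximum a posteriori model
```

With these numbers the posterior puts probability 0.25 on "a" and 0.75 on "b", so a classifier that picks the most probable model chooses "b".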
Bayesian Methods
[Figure: P(X|a) and P(X|b), each plotted against # black pixels]
High-Level Vision
• Human mechanisms: ???
High-Level Vision
• Computational mechanisms
  – Bayesian networks
  – Templates
  – Linear subspace methods
  – Kinematic models
Cootes et al.
Template-Based Methods
Linear Subspaces
[Figure labels: Data; PCA; New Basis Vectors]
Kirby et al.
Principal Components Analysis (PCA)
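The "Data, PCA, New Basis Vectors" idea can be sketched in a few lines: center the data, form its covariance matrix, and take the dominant eigenvector as the first new basis vector. The version below finds that eigenvector by power iteration, which is an assumed simplification (library implementations typically use an SVD); the function names and the 2D sample points are hypothetical.

```python
import math

def covariance(data):
    """Return the mean vector and sample covariance matrix of a list of points."""
    n, d = len(data), len(data[0])
    mean = [sum(p[i] for p in data) / n for i in range(d)]
    centered = [[p[i] - mean[i] for i in range(d)] for p in data]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return mean, cov

def first_principal_component(cov, iterations=100):
    """Dominant eigenvector and eigenvalue of a covariance matrix by power iteration.

    Assumes the covariance matrix is nonzero and the start vector is not
    orthogonal to the dominant eigenvector.
    """
    d = len(cov)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the matching eigenvalue (v has unit length).
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return v, eigenvalue

# Hypothetical 2D points lying close to a line through the data mean.
data = [(2.0, 1.0), (4.0, 2.2), (6.0, 2.8), (8.0, 4.0)]
mean, cov = covariance(data)
pc1, variance1 = first_principal_component(cov)

# Coordinates of each point in the 1D subspace spanned by pc1.
projected = [sum((p[i] - mean[i]) * pc1[i] for i in range(len(pc1))) for p in data]
```

Keeping only `projected` compresses each point from two numbers to one while retaining most of the variance. The same recipe applies when each data point is a whole image flattened into a long vector.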
Kinematic Models
• Optical Flow/Feature tracking: no constraints
• Layered Motion: rigid constraints
• Articulated: kinematic chain constraints
• Nonrigid: implicit / learned constraints
Real-world Applications
Osuna et al.
Course Outline
• Image formation and capture
• Filtering and feature detection
• Optical flow and tracking
• Projective geometry
• Shape from X
• Segmentation and clustering
• Recognition
• Applications: 3D scanning; image-based rendering
3D Scanning
Image-Based Modeling and Rendering
Debevec et al.
Manex
Course Mechanics
• 60%: 4 written / programming assignments
• 30%: Final group project
• 10%: In-class participation (includes attendance, project presentation, etc.)
Course Mechanics
• Book: Computer Vision – A Modern Approach, David Forsyth and Jean Ponce
• Papers
• All online – available from class webpage