What should be done at the Low Level?
16-721: Learning-Based Methods in Vision
A. Efros, CMU, Spring 2009
Class Introductions
• Name:
• Research area / project / advisor
• What do you want to learn in this class?
• When I am not working, I ______________
• Favorite fruit:
Analysis Projects / Presentations
Wed: Varun
note-taker: Dan
Next Wed: Dan
note-taker: Edward
Dan and Edward need to meet with me ASAP
Varun needs to meet second time
Four Stages of Visual Perception
© Stephen E. Palmer, 2002
Image-Based Processing
Surface-Based Processing
Object-Based Processing
Category-Based Processing
[Diagram labels: Light, Sound, Odor (etc.); Vision, Audition; STM, LTM; Motor; Movement]
Ceramic cup on a table
David Marr, 1982
Four Stages of Visual Perception
The Retinal Image
Panels: An Image, (blowup), Receptor Output
Four Stages of Visual Perception
Image-based Representation
Primal Sketch (Marr)
An Image
(Line Drawing)
Retinal Image
Image-based processes
Edges, Lines, Blobs, etc.
Four Stages of Visual Perception
Surface-based Representation
Primal Sketch, 2.5-D Sketch
Image-based Representation
Surface-based processes
Stereo, Shading, Motion, etc.
Four Stages of Visual Perception
Object-based Representation
Object-based processes
Grouping, Parsing, Completion, etc.
Surface-based Representation
2.5-D Sketch, Volumetric Sketch
Four Stages of Visual Perception
Category-based Representation
Category-based processes
Pattern-Recognition
Spatial-description
Object-based Representation
Volumetric Sketch, Basic-level Category
Category: cup
Color: light-gray
Size: 6”
Location: table
Finding 3D structure in two-tone images requires distinguishing cast shadows, attached shadows, and areas of low reflectivity
The images do not contain this information a priori (at low level)
Cavanagh's argument
Feedforward vs. feedback models
Marr's model (circa 1980): stimulus, primal sketch, 2½D sketch, 3D model, object
  reconstruction of shape from image features
  object recognition by matching 3D models
Cavanagh's model (circa 1990s): stimulus, 2D shape, memory, 3D shape, with feedback
  basic recognition with 2D primitives
A Classical View of Vision
Grouping / Segmentation
Figure/Ground Organization
Object and Scene Recognition
pixels, features, edges, etc.
Low-level
Mid-level
High-level
A Contemporary View of Vision
Figure/Ground Organization
Grouping / Segmentation
Object and Scene Recognition
pixels, features, edges, etc.
Low-level
Mid-level
High-level
But where do we draw this line?
Question #1: What (if anything) should be done at the “Low-Level”?
N.B. I have already told you everything that is known. From now on, there aren’t any answers. Only questions…
Eye is not a photometer!
"Every light is a shade, compared to the higher lights, till you come to the sun; and every shade is a light, compared to the deeper shades, till you come to the night."
— John Ruskin, 1879
Question #1: What (if anything) should be done at the “Low-Level”?
i.e. What input stimulus should we be invariant to?
Invariant to:
• Brightness / Color changes?
small brightness / color changes
low-frequency changes
But one can be too invariant
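A minimal sketch of the two invariances named above, on a 1-D row of pixel values. It is only an illustration under assumptions: the function names, the moving-average window and the toy signals are hypothetical, not taken from the lecture. Subtracting the global mean cancels a constant brightness offset; subtracting a local mean suppresses slow (low-frequency) changes. The last lines show how this can be "too invariant": a uniformly dark patch and a uniformly bright patch become identical.

```python
def remove_mean(signal):
    """Subtract the global mean: cancels a constant brightness offset."""
    mean = sum(signal) / len(signal)
    return [v - mean for v in signal]


def remove_local_mean(signal, radius=2):
    """Subtract a moving average: suppresses slow (low-frequency) changes."""
    out = []
    for i, v in enumerate(signal):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        window = signal[lo:hi]
        out.append(v - sum(window) / len(window))
    return out


edge = [10, 10, 10, 10, 50, 50, 50, 50]          # a step edge
brighter = [v + 30 for v in edge]                # same edge, uniformly brighter
ramp = [v + 5 * i for i, v in enumerate(edge)]   # same edge plus a slow illumination ramp

# Constant offset is removed exactly.
same_after_offset = remove_mean(edge) == remove_mean(brighter)

# The linear ramp is removed wherever the window is not clipped by the border.
same_after_ramp = remove_local_mean(ramp)[2:6] == remove_local_mean(edge)[2:6]

# Too much invariance: dark and bright uniform patches collapse to the same thing.
dark, bright = [5] * 8, [200] * 8
indistinguishable = remove_mean(dark) == remove_mean(bright)
```

Whether losing the dark/bright distinction matters depends on the task, which is the point of the question on this slide.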
Invariant to:
• Edge contrast / reversal?
I shouldn’t care what background I am on!
but be careful of exaggerating noise
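The trade-off above can be sketched in a few lines. This is an illustrative toy, not the lecture's method: the helper names and the `eps` damping constant are assumptions. Dropping the gradient sign makes a dark-on-light object look like a light-on-dark one, while normalizing contrast without any damping turns sensor noise in a flat patch into full-strength "edges".

```python
def gradient_1d(signal):
    """Forward differences between neighbouring samples."""
    return [b - a for a, b in zip(signal, signal[1:])]


def normalize_contrast(grad, eps=0.0):
    """Scale gradients by the largest magnitude; eps limits noise amplification."""
    scale = max(abs(g) for g in grad) + eps
    return [g / scale for g in grad] if scale > 0 else list(grad)


dark_on_light = [200, 200, 40, 40, 200, 200]       # object darker than background
light_on_dark = [255 - v for v in dark_on_light]   # same object, reversed polarity

g1 = gradient_1d(dark_on_light)    # [0, -160, 0, 160, 0]
g2 = gradient_1d(light_on_dark)    # [0, 160, 0, -160, 0]

# Signed gradients differ, unsigned gradients agree.
unsigned_match = [abs(g) for g in g1] == [abs(g) for g in g2]

# A nearly flat patch: only small fluctuations, no real edge.
noise = gradient_1d([100, 101, 100, 101, 100])
blown_up = normalize_contrast(noise)            # fluctuations scaled up to magnitude 1.0
damped = normalize_contrast(noise, eps=20.0)    # eps keeps them small
```

The value of `eps` here is arbitrary; choosing it is exactly the "how invariant should we be" question.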
Representation choices
Raw Pixels
Gradients:
Gradient Magnitude:
Thresholded gradients (edge + sign):
Thresholded gradient mag. (edges):
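The representations listed above can be sketched on a tiny image as follows. This is a minimal illustration under assumptions: forward differences stand in for whatever gradient filter is actually used, the threshold value is arbitrary, and plain Python lists are used only to keep the example dependency-free.

```python
import math


def gradients(img):
    """Forward-difference gradients (gx, gy); the last column/row is set to 0."""
    h, w = len(img), len(img[0])
    gx = [[img[y][x + 1] - img[y][x] if x + 1 < w else 0 for x in range(w)]
          for y in range(h)]
    gy = [[img[y + 1][x] - img[y][x] if y + 1 < h else 0 for x in range(w)]
          for y in range(h)]
    return gx, gy


def magnitude(gx, gy):
    """Gradient magnitude: discards direction and sign."""
    return [[math.hypot(a, b) for a, b in zip(rx, ry)] for rx, ry in zip(gx, gy)]


def threshold_signed(g, t):
    """+1 or -1 where |g| exceeds t, else 0 (edge + sign)."""
    return [[(v > t) - (v < -t) for v in row] for row in g]


def threshold_edges(mag, t):
    """1 where the magnitude exceeds t, else 0 (edges only)."""
    return [[1 if v > t else 0 for v in row] for row in mag]


# Raw pixels: a vertical step edge.
img = [
    [10, 10, 90, 90],
    [10, 10, 90, 90],
    [10, 10, 90, 90],
]
gx, gy = gradients(img)
mag = magnitude(gx, gy)
signed = threshold_signed(gx, t=20)   # first row: [0, 1, 0, 0]
edges = threshold_edges(mag, t=20)    # first row: [0, 1, 0, 0]
```

Each step down the list throws away more information: gradients drop absolute brightness, the magnitude drops polarity, and thresholding drops contrast.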
Spatial invariance
• Rotation, Translation, Scale
• Yes, but not too much…
• In brain: complex cells – partial invariance
• In Comp. Vision: histogram-binning methods (SIFT, GIST, Shape Context, etc.) or, equivalently, blurring (e.g. Geometric Blur); will discuss later
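A toy sketch of the histogram-binning idea, loosely in the spirit of the descriptors named above but not an implementation of any of them. The function name, cell size and edge positions are assumptions for illustration. Counting features per spatial cell makes the representation insensitive to shifts that stay within a cell, but not to larger ones.

```python
def cell_histogram(edge_positions, length, cell_size):
    """Count edge positions falling in each spatial cell of width cell_size."""
    n_cells = (length + cell_size - 1) // cell_size
    hist = [0] * n_cells
    for p in edge_positions:
        if 0 <= p < length:
            hist[p // cell_size] += 1
    return hist


edges = [1, 5, 9]                         # edge locations in a 16-sample signal
shifted_small = [p + 1 for p in edges]    # shifted by one sample
shifted_large = [p + 4 for p in edges]    # shifted by a whole cell

h0 = cell_histogram(edges, 16, 4)           # [1, 1, 1, 0]
h1 = cell_histogram(shifted_small, 16, 4)   # [1, 1, 1, 0], unchanged
h2 = cell_histogram(shifted_large, 16, 4)   # [0, 1, 1, 1], changed
```

The invariance is only partial: an edge sitting next to a cell boundary would change bins even under a one-sample shift, which is one reason practical descriptors soften the binning.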