CORNELL UNIVERSITY
CS 764 Seminar in Computer Vision
Ramin Zabih
Fall 1998
Course mechanics
Meeting time will be Tue/Thu 11-12, here
• Starting a week from today
Home page is now up: www/CS764
Assignment: present one paper
• You’ll have a lot of freedom, but you need to talk to me in advance
• Some possible papers will be posted shortly
Topic of this seminar
The use of “knowledge” in the analysis of visual data
• Sometimes called “context”
Clearly this is vital
• On both psychological and technical grounds
• But how? No one has much of an idea…
What is the interface between reasoning and perception? (Or, mind and body?)
What is the visual system’s “contract”?
Two standard (bad) answers
Answer 1: describe the scene in terms of surfaces [low-level vision]
• There is a green patch 2” wide 1’ away
Answer 2: describe the scene in terms of objects [model-based recognition]
• Start with a set of 3D models (modelbase)
• Determine position and pose
Why are these answers wrong?
They are almost purely data-driven
• Bottom-up (from the data) versus top-down (from somewhere else)
They report “objective fact”, with no room for the task at hand
• For a given image, there is only one right answer
Other problems as well
• Not very useful, etc.
Technical and psychological arguments
There are technical arguments against this
• Vision is an inverse problem
– Many 3D scenes could explain a single 2D image
• On engineering grounds, this makes no sense
– Ultimately, perception is used for some task
The human perceptual system has both top-down and bottom-up elements
• Various optical illusions
– Two people can look at the same picture and see something completely different
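The claim that many 3D scenes could explain a single 2D image can be made concrete with a small sketch. It assumes an idealized pinhole camera with perspective projection; the function names `project` and `scale` and the focal length of 1.0 are illustrative choices, not anything taken from the slides.

```python
# Illustrative sketch (assumption: ideal pinhole camera, not from the slides).
# A 3D point (X, Y, Z) projects to the image point (f*X/Z, f*Y/Z).

def project(point, focal_length=1.0):
    """Perspective-project a 3D point with Z > 0 onto the image plane."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)


def scale(point, s):
    """Slide a 3D point along its viewing ray by scaling every coordinate by s."""
    return tuple(s * c for c in point)


near = (1.0, 0.5, 2.0)
far = scale(near, 4.0)  # four times farther away, with offsets four times larger

print(project(near))  # (0.5, 0.25)
print(project(far))   # (0.5, 0.25)
```

Both points land on the same image location, so the image alone cannot say which one was in the scene. Recovering depth requires some additional assumption or source of information.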
Your vision system doesn’t listen
It makes “reasonable” assumptions
Low-level vision has its solution
Inverse problems require assumptions
The assumptions for low-level vision are extremely general (i.e., weak)
• Reflect the physics of the visible world
• For example, motion or depth or intensity tend to be “coherent”
– Saying that every pixel is moving differently from its neighbors is a very unlikely answer
– The world we live in tends not to do that
– Helmholtz’s “unconscious inference”
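One way to picture the “coherence” assumption is as a penalty on neighboring values that disagree. The sketch below assumes a 1D row of per-pixel motion values and a quadratic penalty; both are illustrative choices, as are the names `smoothness_cost` and `pick_most_coherent`, and none of this is claimed to be the method the slides have in mind.

```python
# Illustrative sketch (assumptions: 1D motion field, quadratic neighbor penalty).

def smoothness_cost(field):
    """Sum of squared differences between neighbors (lower means more coherent)."""
    return sum((a - b) ** 2 for a, b in zip(field, field[1:]))


def pick_most_coherent(candidates):
    """Among candidate explanations of the same data, prefer the smoothest one."""
    return min(candidates, key=smoothness_cost)


# Two hypothetical motion fields, imagined as equally consistent with the image data.
coherent = [1, 1, 1, 1, 2, 2, 2, 2]       # two regions, each moving uniformly
incoherent = [1, -3, 4, 0, 2, -1, 5, -2]  # every pixel moves differently

print(smoothness_cost(coherent))    # 1
print(smoothness_cost(incoherent))  # 179
print(pick_most_coherent([incoherent, coherent]) is coherent)  # True
```

The penalty does not forbid the incoherent answer; it only makes it a much less preferred explanation, which matches the slide's wording that such an answer is "very unlikely".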
We’ll need high-level vision
Most of the field is low-level vision or model-based recognition
• Partly to avoid the confusion CS764 is about
Key question: how to avoid brittleness?
• Can make the visual system compute just what we need for our task (e.g., berries)
• But how to handle the unexpected (e.g., lions)?
A short historical perspective
In the 1960’s, vision was completely task-specific
• A black blob in the center of the image is a telephone
• These efforts are now considered “hacks”
In the 1970’s, vision became completely general
• Marr pushed the field towards precise technical questions
• Low-level vision and recognition became dominant
Tasks strike back
In the mid-1980’s, several attempts were made to re-introduce a notion of task
• Active/animate/purposive vision
These attempts are widely viewed as failures, for good reasons
• We’ll look at them a bit next week
It’s not enough to have good intuitions
• There needs to be technical merit as well
Desiderata
Technical solutions (algorithms) that are very roughly consistent with human data
• Goal is not AI, psychology or philosophy
Provide visual summaries useful for tasks, but degrade gracefully
• Handle open/unstructured environments
• Deal with expectations and breakdown
Our path for 764
No good computational work to read
• Perhaps Vera will fix this?
We will examine papers along these lines:
• Computational approaches that failed
• Psychological data that is highly suggestive
• Neurologically inspired architectures
• Cognitive scientists and philosophers
– Their goal is argument, not algorithm!
– They’ve thought the most about these issues