Post on 15-Jan-2016

Vision in Man and Machine

STATS 19 SEM 2. 263057202. Talk 2.

Alan L. Yuille.

UCLA. Dept. Statistics and Psychology.

www.stat.ucla/~yuille

The Purpose of Vision.

“To Know What is Where by Looking”. Aristotle. (384-322 BC).

Information Processing: receive a signal by light rays and decode its information.

Vision appears deceptively simple, but there is more to Vision than meets the Eye.

Ames Room

Perspective.

Curved Lines?

Brightness of Patterns: Adelson (MIT)

Visual Illusions

The perception of the brightness of a surface, or the length of a line, depends on context, not on basic measurements like the number of photons that reach the eye or the length of the line in the image.

Perception as Inference

Helmholtz. 1821-1894. “Perception as Unconscious Inference”.

Ball in a Box. (D. Kersten)

How Hard is Vision?

The Human Brain devotes an enormous amount of resources to vision.

(I) Optic nerve is the biggest nerve in the body.

(II) Roughly half of the neurons in the cortex are involved in vision (van Essen).

If intelligence is proportional to neural activity, then vision requires more intelligence than mathematics or chess.

Vision and the Brain

Half the Cortex does Vision

Vision and Artificial Intelligence

The hardness of vision became clearer when the Artificial Intelligence community tried to design computer programs to do vision (’60s).

AI workers thought that vision was “low-level” and easy.

Prof. Marvin Minsky (pioneer of AI) asked a student to solve vision as a summer project.

Chess and Face Detection

Artificial Intelligence Community preferred Chess to Vision.

By the mid-’90s chess programs could beat the world champion Kasparov.

But computers could not find faces in images.

Man and Machine.

David Marr (1945-1980) Three Levels of explanation:

1. Computational Level/Information Processing

2. Algorithmic Level

3. Hardware: Neurons versus silicon chips.

Claim: Man and Machine are similar at Level 1.

Vision: Decoding Images

Vision as Probabilistic Inference

Represent the World by S.

Represent the Image by I.

Goal: decode I and infer S.

Model image formation by a likelihood function (generative model), P(I|S).

Model our knowledge of the world by a prior P(S).
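To make the roles of P(S) and P(I|S) concrete, here is a minimal sketch of a generative model. It is not from the talk: the two world states ("dark" and "light" surfaces), the brightness means, and the noise level are all assumed values chosen for illustration. It first draws a world state S from the prior, then draws an image measurement I from the likelihood.

```python
import random

# Assumed toy world (not from the talk): S is a surface type,
# I is a single noisy brightness measurement.
PRIOR = {"dark": 0.5, "light": 0.5}             # P(S), assumed values
MEAN_BRIGHTNESS = {"dark": 0.3, "light": 0.7}   # assumed mean of P(I|S)
NOISE_STD = 0.1                                 # assumed measurement noise


def sample_world(rng):
    """Draw a world state S from the prior P(S)."""
    states = list(PRIOR)
    weights = [PRIOR[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


def sample_image(state, rng):
    """Draw a measurement I from the generative model P(I|S)."""
    return rng.gauss(MEAN_BRIGHTNESS[state], NOISE_STD)


def generate(n, seed=0):
    """Generate n (S, I) pairs: world first, then the image it produces."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        state = sample_world(rng)
        pairs.append((state, sample_image(state, rng)))
    return pairs
```

Calling `generate(5)` returns five (S, I) pairs. Vision runs this process in reverse: it is given only I and must infer S.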

Bayes Theorem

Then Bayes’ Theorem states we should infer the world S from I by

P(S|I) = P(I|S)P(S)/P(I).

Rev. T. Bayes. 1702-1761

Bayes to Infer S from I

P(I|S): likelihood function. P(S): prior.
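As a small worked illustration of P(S|I) = P(I|S)P(S)/P(I), the sketch below computes the posterior over a discrete set of world states. The states, the prior values, and the Gaussian form of the likelihood are assumptions made for this example, not details given in the talk.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def posterior(image, prior, likelihood):
    """Return P(S|I) for every world state S via Bayes' theorem."""
    joint = {s: likelihood(image, s) * p for s, p in prior.items()}
    evidence = sum(joint.values())  # P(I) = sum over S of P(I|S) P(S)
    return {s: value / evidence for s, value in joint.items()}


# Assumed toy numbers: dark surfaces are more common than light ones.
prior = {"dark": 0.8, "light": 0.2}
means = {"dark": 0.3, "light": 0.7}


def likelihood(image, state):
    """P(I|S): brightness is the state's mean plus Gaussian noise (assumed)."""
    return gaussian_pdf(image, means[state], 0.2)


ambiguous = posterior(0.5, prior, likelihood)   # measurement midway between means
bright = posterior(0.7, prior, likelihood)      # measurement at the "light" mean
```

With a measurement of 0.5 the two likelihoods are equal, so the posterior is (up to rounding) just the prior: the data are ambiguous and prior knowledge decides. With a measurement of 0.7 the likelihood favours "light" strongly enough to outweigh the prior.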

Technically very interdisciplinary

But applying Bayes is not straightforward.

A beautiful theory is being developed, adapting techniques from Computer Science, Engineering, Mathematics, Physics, and Statistics.

E.g. Probabilistic Reasoning (Pearl, CS), Level Sets (Osher, Maths).

Examples

Generative Models.

Visual Inference:

(1) Estimating Shape.

(2) Segmenting Images.

(3) Detecting Faces.

(4) Detecting and Reading Text.

Generative Models

Learn Generative Models from a few images and then generate new images.

Uses of Generative Models

Univ. Oxford

Shape Inference: (Zhu Lab)

Shape and Photometry (Soatto Lab)

– Estimate geometry (shape) and photometry from multiple images.

Jin-Soatto-Yezzi

Compare ground truth (Soatto Lab)

Jin-Soatto-Yezzi 11/1/02

Estimated shape

Alternative algorithm

Ground truth

Generated Image: synthesized from novel viewpoint and illumination.

Ground Truth: same lighting and viewpoint

Compare w. ground truth (Soatto Lab)

Segmentation (Level Sets)

Segmenting Images (Zhu Lab)

Characterize the set of image patterns that occur in natural images.

Provide mathematical models: P(I|S) and P(S).

Face and Text Detection.

Back to the Brain

Top-Level: compare human performance to Ideal Observers.

Explain human perceptual biases (visual illusions) as strategies that are “statistically effective”.
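One way to read “statistically effective” can be sketched with a toy ideal observer. The setup below is an assumption made for illustration, not a model from the talk: the world quantity S has a Gaussian prior, the measurement I is S plus Gaussian noise, and the observer reports the posterior mean. That estimate is biased toward the prior mean, yet its average squared error is lower than that of simply reporting the raw measurement.

```python
def posterior_mean_weight(prior_var, noise_var):
    """Weight on the measurement in the Gaussian posterior mean."""
    return prior_var / (prior_var + noise_var)


def ideal_observer_estimate(measurement, prior_mean, prior_var, noise_var):
    """Posterior mean of S given I, for a Gaussian prior and Gaussian noise."""
    w = posterior_mean_weight(prior_var, noise_var)
    return w * measurement + (1 - w) * prior_mean


def expected_squared_error(weight, prior_var, noise_var):
    """Average squared error of the estimate weight*I + (1-weight)*prior_mean.

    The error is (weight - 1)*(S - prior_mean) + weight*noise, so its
    variance is (1-weight)^2 * prior_var + weight^2 * noise_var.
    """
    return (1 - weight) ** 2 * prior_var + weight ** 2 * noise_var


# Assumed numbers: prior and noise are equally uncertain.
prior_var, noise_var = 1.0, 1.0
w = posterior_mean_weight(prior_var, noise_var)

biased_error = expected_squared_error(w, prior_var, noise_var)       # ideal observer
unbiased_error = expected_squared_error(1.0, prior_var, noise_var)   # raw measurement
```

Here the ideal observer pulls a measurement of 2.0 halfway back toward a prior mean of 0.0, and its expected squared error is half that of the unbiased estimate. Under these assumptions the bias is a sensible strategy on average, even though it produces "illusions" on individual stimuli.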

Brain Architecture

The Bayesian models have interesting analogies to the brain.

Generative Models require top-down processing.

High-Level Tells Low-Level to Shut Up (Kersten Lab)

Conclusion

Vision is unconscious inference. Theory of Vision for Man and Machine.

See more about Vision at UCLA in the Vision and Image Science Collective

http://visciences.ucla.edu
