My Group’s Current Research on Image Understanding

An image-understanding task

Low-level vision

Color, Shape, Texture

Low-level vision


Simple Segmentation

Low-level vision



Object recognition



Object recognition

High-level perception



Object recognition


Pattern recognition



Object recognition


Pattern recognition

Analogy-making



Object recognition


Pattern recognition

“Meaning”

Analogy-making



Object recognition


??? Pattern recognition

“Meaning”

Analogy-making



Object recognition


Pattern recognition

“Meaning”

Analogy-making

The “SEMANTIC GAP”



Object recognition


Pattern recognition

“Meaning”

Analogy-making

HMAX model of visual cortex (Riesenhuber, Poggio, et al.)

The “SEMANTIC GAP”



Object recognition


Pattern recognition

“Meaning”

Analogy-making

Active Symbol Architecture for high-level perception (Hofstadter et al.)

HMAX model of visual cortex (Riesenhuber, Poggio, et al.)

The “SEMANTIC GAP”

The HMAX model for object recognition (Riesenhuber, Poggio, Serre, et al.)

1. Densely tile the image with windows of different sizes.

2. HMAX features are computed in each window.

3. The features in each window are given as input to the trained support vector machine.

4. If the SVM returns a score above a learned threshold, then the object is said to be “detected”.
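
The four steps above can be sketched as a plain sliding-window loop. This is only an illustration under stated assumptions: `mean_brightness` and `toy_score` are hypothetical stand-ins for the HMAX feature computation and the trained SVM, and the window sizes, stride, and threshold are made-up values, not those of the actual system.

```python
from typing import Callable, Iterator, List, Tuple

Window = Tuple[int, int, int]  # (top row, left column, side length)


def tile_windows(height: int, width: int, sizes: List[int],
                 stride_frac: float = 0.5) -> Iterator[Window]:
    """Step 1: densely tile the image with square windows of several sizes."""
    for size in sizes:
        stride = max(1, int(size * stride_frac))
        for r in range(0, height - size + 1, stride):
            for c in range(0, width - size + 1, stride):
                yield (r, c, size)


def detect(image, sizes: List[int], features: Callable, score: Callable,
           threshold: float):
    """Steps 2-4: compute features per window, score them, keep the hits."""
    hits = []
    for (r, c, size) in tile_windows(len(image), len(image[0]), sizes):
        patch = [row[c:c + size] for row in image[r:r + size]]
        s = score(features(patch))      # stand-in for the SVM score
        if s > threshold:               # "detected" only above the threshold
            hits.append(((r, c, size), s))
    return hits


# Hypothetical stand-ins (NOT HMAX, NOT a trained SVM).
def mean_brightness(patch):
    values = [v for row in patch for v in row]
    return [sum(values) / len(values)]


def toy_score(feats):
    return feats[0] - 0.5


# Toy 8x8 image with a bright 4x4 "object" in the lower-right corner.
image = [[0.0] * 8 for _ in range(8)]
for r in range(4, 8):
    for c in range(4, 8):
        image[r][c] = 1.0

detections = detect(image, sizes=[4], features=mean_brightness,
                    score=toy_score, threshold=0.25)
```

Note that every window is visited regardless of content, which is the exhaustive search the later slides identify as a limitation.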

…

…

Recognition Phase

Streetscenes “scene understanding” system (Bileschi, 2006)

Object detection (here, “car”) with HMAX model (Bileschi, 2006)

Some limitations of the Streetscenes approach to scene understanding


• Requires exhaustive search for object identification and localization



Exhaustive search over:




• Window size and location in the image





• Object categories (e.g., car, pedestrian, tree, etc.)





Exhaustive use of HMAX features in each window

• Does not recognize spatial and abstract relationships among objects for whole scene understanding


• Has no prior knowledge about object categories and their place in “conceptual space”


• HMAX model is completely feed-forward; no feedback to allow context to aid in scene understanding.

Goal of our project

• Perform whole-scene interpretation without exhaustive search.

– Incorporate conceptual knowledge

– Allow feedforward and feedback modes to interact

A Simple Semantic Network (or “Ontology”)

[Diagram: “Dog walking” network. Nodes: Person, Dog, leash, walking. Link labels: holds, attached to, action (twice).]

But...

http://www.dogasaur.com/blog/wp-content/uploads/2011/04/dogwalker.jpg

But...

http://www.vet.k-state.edu/depts/development/lifelines/images/dog_jog_1435.jpg

[Diagram: “Dog walking” network. Nodes: Person, Dog, Dog Group, leash, walking, running. Link labels: holds, attached to, action (twice).]

Allowing “conceptual slippage”

[Diagram: “Dog walking” network. Nodes: Person, Dog, Dog Group, leash, walking, running. Link labels: holds, attached to, action (twice).]

But...

http://3.bp.blogspot.com/_1YuoCTv4oKQ/S71jUDm7kOI/AAAAAAAAAak/jz4Pg7zzzQ8/s1600/23743577.JPG

http://lh3.ggpht.com/-ZZrYWeBFTjo/SFQH_0ijwaI/AAAAAAAABjA/8nwryW2BmEw/IMG_0356.JPG

[Diagram: “Dog walking” network. Nodes: Person, Dog, Dog Group, Cat, Iguana, leash, Tail, walking, running. Link labels: holds, attached to, action (twice).]

But...

http://www.mileanhour.com/post/Dog-walking-bike.aspx

http://cl.jroo.me/z3/Z/e/C/d/a.aaa-Thus-walking-dog.png

http://thedaemon.com/images/DARPA_Segue_Dog.jpg

http://www.bikeforest.com/product45422.jpg

http://www.k9ring.com/blog/image.axd?picture=2010%2F3%2Fwalking_dog_from_car.jpg

http://www.guy-sports.com/fun_pictures/dog_walking_helicopter.jpg

http://static.themetapicture.com/media/funny-dog-walking-horse-leash.jpg

http://macwetblog.files.wordpress.com/2012/05/dog-walking.jpg

[Diagram: “Dog walking” network. Nodes: Person, Dog, Dog Group, Cat, Iguana, Horse, leash, Tail, walking, running, Biking, Driving, Segue-ing, Treadmill-ing, Car, Helicopter. Link labels: holds, attached to, action (twice).]

Active Symbol Architecture (Hofstadter et al., 1995)

• Basis for

– Copycat (analogy-making), Hofstadter & Mitchell

– Tabletop (analogy-making), Hofstadter & French

– Metacat (analogy-making and self-awareness), Hofstadter & Marshall

and many others…

Semantic network

Temperature

Workspace


Perceptual agents (codelets) are “active symbols”

Petacat:

(Descendant of Copycat, part of the PetaVision project)

Integration of Active Symbol Architecture and HMAX

Initial task:

Decide if image is an instance of “taking a dog for a walk”, and if so, how good an instance it is.

Workspace

Semantic network

Workspace

Semantic Network

[Diagram: semantic network. Labels recoverable from the slide:]

• Situation node: taking a dog for a walk

• Category nodes: Object, Action, Spatial Relation

• Concept nodes: person, dog, cat, horse, leash, rope, belt, string, sidewalk, road, beach, trail, grass, outdoors, indoors

• Action nodes: walks, runs, stands, sits, drives, flies, swims

• Spatial-relation nodes: is on, is touching, is in front of, is behind, is next to

• Link labels: has location, has action, has component

• Link types: Property links, Slip links
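
A semantic network with the two link types named on the slide can be sketched as two adjacency maps. This is a minimal illustration, not Petacat's data structure: the class, its method names, the particular slip links, and the numeric slippage costs are all assumptions made for the example.

```python
from collections import defaultdict


class SemanticNetwork:
    """Toy network with labeled property links and symmetric slip links."""

    def __init__(self):
        self.property_links = defaultdict(list)  # node -> [(label, target)]
        self.slip_links = defaultdict(dict)      # node -> {neighbor: cost}

    def add_property(self, source, label, target):
        self.property_links[source].append((label, target))

    def add_slip(self, a, b, cost):
        # Slip links are treated as symmetric here (an assumption).
        self.slip_links[a][b] = cost
        self.slip_links[b][a] = cost

    def slippage_cost(self, expected, found):
        """0.0 for an exact match, the link cost for a slippage, else None."""
        if expected == found:
            return 0.0
        return self.slip_links.get(expected, {}).get(found)


net = SemanticNetwork()
situation = "taking a dog for a walk"
for component in ("person", "dog", "leash"):
    net.add_property(situation, "has component", component)
net.add_property(situation, "has location", "outdoors")

# Hypothetical slip links and costs, chosen only for illustration.
net.add_slip("dog", "cat", 0.5)
net.add_slip("leash", "rope", 0.2)
net.add_slip("walks", "runs", 0.1)
```

With this representation, asking whether a found concept can stand in for an expected one is a single lookup, e.g. `net.slippage_cost("leash", "rope")`.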



Temperature

• Measures how well organized the program’s “understanding” is as processing proceeds

– Little organization → high temperature

– Lots of organization → low temperature

• Temperature feeds back to affect perceptual agents:

– High temperature → low confidence in decisions → decisions are made more randomly

– Low temperature → high confidence in decisions → decisions are made more deterministically
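
One common way to make choices more random at high temperature and more deterministic at low temperature is a softmax over option scores. The sketch below uses that generic technique purely as an illustration; it is not the temperature formula used by Copycat or Petacat, and the function name and example scores are assumptions.

```python
import math


def choice_probabilities(scores, temperature):
    """Softmax with temperature: p_i is proportional to exp(score_i / T).

    `temperature` must be positive. High T flattens the distribution
    (near-random choice); low T concentrates it on the best option.
    """
    scaled = [s / temperature for s in scores]
    top = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


scores = [0.8, 0.4]                         # e.g. two competing decisions
hot = choice_probabilities(scores, temperature=100.0)   # little organization
cold = choice_probabilities(scores, temperature=0.01)   # lots of organization
```

Here `hot` is close to uniform while `cold` puts almost all probability on the higher-scoring option.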

Input image

Input image Weak segmentation


Location “heat map” (probability distribution over pixel locations)

Scale “heat map” (probability distribution over scales at each pixel location)
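
Sampling a window from the two heat maps replaces exhaustive tiling: first draw a location, then draw a scale conditioned on that location. The following is a minimal sketch under assumptions; the grid representation, the dictionary layout of `scale_map`, and the function names are illustrative, not taken from Petacat.

```python
import random


def normalize(grid):
    """Turn a grid of non-negative weights into a probability distribution."""
    total = sum(sum(row) for row in grid)
    return [[v / total for v in row] for row in grid]


def sample_location(heat_map, rng):
    """Draw a (row, col) cell with probability given by the location heat map."""
    cells = [(r, c) for r, row in enumerate(heat_map) for c in range(len(row))]
    weights = [heat_map[r][c] for r, c in cells]
    return rng.choices(cells, weights=weights, k=1)[0]


def sample_scale(scale_map, location, rng):
    """Draw a window size from the scale distribution at that location."""
    scales, probs = zip(*scale_map[location].items())
    return rng.choices(scales, weights=probs, k=1)[0]


rng = random.Random(0)
# Toy 2x2 heat map with all probability mass on the lower-right cell.
heat_map = normalize([[0.0, 0.0], [0.0, 3.0]])
# Hypothetical scale distribution at that cell: always a 32-pixel window.
scale_map = {(1, 1): {32: 1.0, 64: 0.0}}

location = sample_location(heat_map, rng)
scale = sample_scale(scale_map, location, rng)
```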

Dog?

Scout codelets: Send C1 features in window to corresponding SVM. If positive result, post builder codelet with urgency equal to SVM’s confidence.
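
The scout step can be sketched as a function that posts a builder codelet onto an urgency-weighted queue. This is a hedged illustration: `Codelet`, `Coderack`, `run_scout`, and the `(is_positive, confidence)` return shape of the toy SVMs are assumptions for the example, and the confidence values are borrowed from the slides only as sample numbers.

```python
import random
from dataclasses import dataclass


@dataclass
class Codelet:
    name: str
    urgency: float


class Coderack:
    """Queue from which codelets are drawn with probability ~ urgency."""

    def __init__(self, rng: random.Random):
        self.rng = rng
        self.codelets = []

    def post(self, codelet: Codelet) -> None:
        self.codelets.append(codelet)

    def pop(self) -> Codelet:
        weights = [c.urgency for c in self.codelets]
        index = self.rng.choices(range(len(self.codelets)),
                                 weights=weights, k=1)[0]
        return self.codelets.pop(index)


def run_scout(category, c1_features, svm, coderack):
    """Scout: query the category's SVM; on success post a builder codelet."""
    is_positive, confidence = svm(c1_features)
    if is_positive:
        coderack.post(Codelet(f"build {category}", urgency=confidence))
    return is_positive


# Toy SVM stand-ins returning (is_positive, confidence); not real classifiers.
toy_svms = {
    "Dog": lambda feats: (True, 0.8),
    "Outdoors": lambda feats: (True, 0.7),
    "Person": lambda feats: (False, 0.0),
}

coderack = Coderack(random.Random(0))
for category, svm in toy_svms.items():
    run_scout(category, c1_features=[0.0], svm=svm, coderack=coderack)

posted = [(c.name, c.urgency) for c in coderack.codelets]
```

Because selection is weighted rather than strictly ordered, a lower-urgency builder still gets an occasional chance to run.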

[Diagram: scout codelets probe image windows with queries such as Dog?, Person?, Sidewalk?, and Outdoors?. Example results:]

• Dog? negative

• Dog? negative

• Sidewalk? positive: 0.4

• Person? negative

• Outdoors? positive: 0.7

• Dog? positive: 0.8

Builder codelets: Ask HMAX to compute C2 features using prototype shapes specific to the object class, and send them to corresponding SVM. If positive, decide to build structure with probability equal to SVM confidence. Break competing structures if necessary.
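
The builder step can be sketched as a probabilistic commit. The sketch makes several simplifying assumptions that are not in the slides: the workspace is a list of dictionaries, a “competing” structure is any structure occupying exactly the same region, and competitors are always broken rather than fought probabilistically. `c2_svm` is a hypothetical stand-in for the C2-feature SVM.

```python
import random


def run_builder(category, region, c2_svm, workspace, rng):
    """Builder: build a structure with probability equal to SVM confidence."""
    is_positive, confidence = c2_svm(category, region)
    if not is_positive:
        return False
    if rng.random() >= confidence:      # build with probability = confidence
        return False
    # Simplification: break every structure that occupies the same region.
    workspace[:] = [s for s in workspace if s["region"] != region]
    workspace.append({"category": category, "region": region,
                      "strength": confidence})
    return True


region = (10, 10, 32)                   # (row, col, size), illustrative values
workspace = [{"category": "cat", "region": region, "strength": 0.3}]

# Confidence 1.0 always builds; confidence 0.0 never does.
built = run_builder("dog", region, lambda cat, reg: (True, 1.0),
                    workspace, random.Random(0))
skipped = run_builder("person", (0, 0, 16), lambda cat, reg: (True, 0.0),
                      workspace, random.Random(0))
```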


Outdoors

Dog



Object-specific heat maps are updated.

[Diagram: Dog has been built; the Person heat map is shown, with Person? scout windows.]

As codelets build structure, heat maps are continually updated to reflect prior (learned) expectations about location and scale as a function of location and scale of “built” objects.
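
One way to picture such an update is to multiply the current heat map by a bump centered where the other object is expected, relative to the built one. The sketch below is an assumption-laden illustration: the Gaussian form, the hand-set offset and spread (which in the described system would be learned), and the function names are not from the slides.

```python
import math


def gaussian_bump(height, width, center, sigma):
    """Unnormalized 2-D Gaussian evaluated on a height x width grid."""
    cr, cc = center
    return [[math.exp(-((r - cr) ** 2 + (c - cc) ** 2) / (2 * sigma ** 2))
             for c in range(width)] for r in range(height)]


def update_heat_map(heat_map, built_location, built_scale,
                    offset_in_scales, sigma_in_scales):
    """Reweight a heat map given the location and scale of a built object."""
    height, width = len(heat_map), len(heat_map[0])
    center = (built_location[0] + offset_in_scales[0] * built_scale,
              built_location[1] + offset_in_scales[1] * built_scale)
    bump = gaussian_bump(height, width, center, sigma_in_scales * built_scale)
    combined = [[heat_map[r][c] * bump[r][c] for c in range(width)]
                for r in range(height)]
    total = sum(map(sum, combined))
    return [[v / total for v in row] for row in combined]


# Uniform 5x5 prior for "person"; a dog was built at (3, 1) with scale 1.
prior = [[1.0 / 25] * 5 for _ in range(5)]
# Hypothetical expectation: person is one scale up and one scale right of dog.
person_map = update_heat_map(prior, built_location=(3, 1), built_scale=1,
                             offset_in_scales=(-1, 1), sigma_in_scales=1.0)
peak = max(((r, c) for r in range(5) for c in range(5)),
           key=lambda rc: person_map[rc[0]][rc[1]])
```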

[Diagram: workspace with Dog and Outdoors built, and the Person heat map. Scout queries: Person?, Dog?, Leash?, Sidewalk?.]

[Diagram: two candidate Person structures, Strength: 0.75 and Strength: 0.6. Built structures afterward: Dog, Person, Outdoors, Sidewalk.]



[Diagram: workspace with Dog, Person, Outdoors, and Sidewalk built. Scout queries: Leash?, Dog?, Sidewalk?, Rope?.]

[Diagram: Leash is built; candidate structures Dog (weak) and Dog (strong) appear. Built structures afterward: Dog, Person, Outdoors, Sidewalk, Leash, Dog.]




Once objects begin to be built, relation and grouping codelets can run on them.

[Diagram: workspace with Dog, Dog, Person, Leash, Outdoors, and Sidewalk. New structures: “is next to” relations and a Dog group.]


How Petacat makes a final decision

[Diagram: temperature gauge beside the workspace. Structures: Dog, Dog, Dog group, Person, Leash, Outdoors, Sidewalk, with “is next to” relations.]

“Situation” codelet is more likely to run when temperature is low.


Situation codelet tries to match prototypical situation with existing workspace structures, possibly allowing slippages.
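
Matching with slippages can be sketched as assigning each role of the prototype to a workspace structure, preferring exact matches and paying a cost for each slippage. This is a deliberately reduced illustration: it matches object roles only and ignores relations, each role is matched greedily and independently, and the slippage table and its cost are hypothetical.

```python
def match_situation(prototype_roles, workspace_objects, slip_costs):
    """Map each prototype role to a workspace object.

    Returns (mapping, total_slippage_cost), or None if some role
    has neither an exact match nor an allowed slippage.
    """
    mapping = {}
    total_cost = 0.0
    for role in prototype_roles:
        candidates = []
        for obj in workspace_objects:
            if obj == role:
                candidates.append((0.0, obj))            # exact match
            elif (role, obj) in slip_costs:
                candidates.append((slip_costs[(role, obj)], obj))
        if not candidates:
            return None                                   # match fails
        cost, obj = min(candidates)                       # cheapest option
        mapping[role] = obj
        total_cost += cost
    return mapping, total_cost


prototype = ["person", "leash", "dog", "outdoors"]
workspace = ["person", "leash", "dog group", "outdoors", "sidewalk"]
slip_costs = {("dog", "dog group"): 0.2}    # hypothetical slippage and cost

result = match_situation(prototype, workspace, slip_costs)
no_leash = match_situation(prototype, ["person", "dog", "outdoors"], slip_costs)
```

In this toy run the role “dog” slips to “dog group”, while the workspace without a leash (or any allowed substitute) fails to match.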

[Diagram: prototypical situation (nodes: person, leash, dog, outdoors; links: is next to, has component, has location, is in front of) shown beside the workspace structures (Dog, Dog, Dog group, Person, Leash, Outdoors, Sidewalk, with “is next to” relations). Later frames add the label “Dog group” to the prototype.]

If the resulting temperature is low enough, classify the scene as positive.


If the situation codelet fails enough times or does not run for a long time, the program has an increasing chance of ending with a negative classification.

Temperature at the end of the run gives a measure of how good an instance the picture is (e.g., of the “dog walking” situation).
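
The decision rule described above can be sketched in a few lines. Everything numeric here is an assumption: the 0 to 100 temperature range, the threshold of 30, the linear mapping from temperature to a goodness score, and the linear growth of the give-up probability are illustrative choices, not Petacat's actual parameters.

```python
def give_up_probability(failed_matches, idle_steps,
                        per_failure=0.1, per_idle_step=0.01):
    """Chance of ending with a negative classification; grows with failures
    of the situation codelet and with the time it has gone without running."""
    return min(1.0, failed_matches * per_failure + idle_steps * per_idle_step)


def classify(final_temperature, situation_matched, threshold=30.0):
    """Positive if the situation matched and temperature is low enough.

    Also returns a goodness-of-instance score in [0, 1], assuming
    temperature lies in [0, 100] (lower temperature = better instance).
    """
    is_positive = situation_matched and final_temperature <= threshold
    goodness = 1.0 - final_temperature / 100.0
    return is_positive, goodness


clear_case = classify(final_temperature=20.0, situation_matched=True)
poor_case = classify(final_temperature=80.0, situation_matched=True)
```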

Temperature
