Spatial Reasoning and Perception: A Logic-Based Approach

Paulo Santos

FEI - São Paulo

September 10, 2010


Outline

1 Preface
    Where/what is FEI?
2 Part I: The Big Picture
    Introduction and motivation
    Automated Reasoning 101
    Qualitative Spatial Reasoning at a glance
3 Part II: Qualitative Spatial Reasoning
    Region Connection Calculus
    Lines of Sight Calculus
    Region Occlusion Calculus
    Cardinal Direction Calculus
    Double Cross Calculus
    Other calculi
    Tractability and computability
4 Coffee Break
    20 min?
5 Part III: Cognitive Vision
    Foundations
    Cognitive Vision at a glance
    Early systems
    Modern systems (from 2000)
6 Part IV: QSR in CogVis
    Depth Profile Calculus
    Reasoning about Shadows in Robotics
    The future: Probabilistic Logic Encoding of Spatial Domains
7 Conclusion

Where is FEI?

Figures: S. Paulo, SP; FEI Campus.

FEI is the largest engineering school in Brazil, with over 8,000 students. It is already a regional centre of scientific development for the automotive industry, and it has started investing intensively to also become a regional centre for intelligent robotics.

Introduction and motivation

- reasoning about space is ubiquitous: it is necessary in situations from tying a shoelace to urban traffic navigation;
- automation of spatial reasoning has led to the development of a number of application domains:
    - geographical information systems (GIS)
    - robotics
    - commonsense reasoning
    - natural language processing
    - virtual world modelling and animation
    - medical analysis and diagnosis systems
    - computer vision

We acquire knowledge about spatial relationships mainly in two ways:
- sensory processing: intensively studied in Computer Vision and Robotics;
- being told, or reading, about spatial arrangements (high-level reasoning). This is the kind of information processing we're concerned with here.

In other words, we'll be talking about qualitative reasoning, in contrast to numerical processing.
We will also present computer vision systems whose aim is to bridge the gap between sensory processing and high-level reasoning: Cognitive Vision systems.

Aim and inspiration for this tutorial

- We follow the belief expressed in Takeo Kanade's Keynote Lecture (given at IJCAI 2003): it is now time to combine Computer Vision with Relational models and Reasoning.
- This tutorial presents:
    - tools and methodology of QSR;
    - an overview of major QSR calculi;
    - an overview of Cognitive Vision Systems;
    - examples of QSR systems for Cognitive Vision.
- Aim: give a brief overview of these areas, presenting the context and foundations to kick-start new projects.

Why a “logic-based approach”?

- has a semantics
- large variety of distinct logics for different kinds of reasoning
- variety of inference mechanisms
- provides a toolkit from which it is possible to characterise the calculi
- developing relational calculi is well understood

Logical Reasoning

Every man is mortal.
Socrates is a man.
Ergo: Socrates is mortal.

In first-order logic:

∀x m(x) → mo(x).
m(Socrates).
Ergo: mo(Socrates).

MODUS PONENS: from A → B and A, infer B.

Logic programming

mo(X) :- m(X).
m(s).

?- mo(X).
X = s

Computational processes: resolution, model checking, ...
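The Socrates example can also be run outside Prolog. The sketch below is a minimal forward-chaining loop in Python that applies modus ponens until no new fact appears. It is only an illustration: Prolog itself answers queries by backward chaining (resolution), and the function name `forward_chain` and the pair-based encoding of facts and rules are assumptions made for this example.

```python
# Minimal forward chaining: apply modus ponens until no new fact appears.
# A rule (premise, conclusion) over one variable, e.g. ("m", "mo"),
# encodes  ∀x m(x) → mo(x).  (Hypothetical encoding, for illustration only.)

def forward_chain(facts, rules):
    """Return the closure of `facts`, a set of (predicate, constant) pairs, under `rules`."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, constant in list(derived):
                new_fact = (conclusion, constant)
                if predicate == premise and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


rules = [("m", "mo")]          # every man is mortal
facts = {("m", "socrates")}    # Socrates is a man
closure = forward_chain(facts, rules)
print(("mo", "socrates") in closure)
```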

Logical Reasoning

Deduction: inferring logical truths
    - All the beans from this bag are white.
    - These beans are from this bag.
    - Ergo, these beans are white. (result)
    - Prolog, Otter, Spass, and many other logic programming systems.

Abduction: jumping to conclusions
    - I took one bean from this bag and it is white.
    - Ergo, all the beans from this bag are white.
    - ACLIP, CIFF, ProLogICA, ...

Induction: generalising from examples
    - one bean from this bag is white;
    - another bean from this bag is white;
    - another bean from this bag is white;
    - ...
    - Ergo, all the beans from this bag are white.
    - Progol, HR, Claudien, FOIL, ...

Knowledge Representation and Reasoning

- the logical formalisation of reasoning processes, capable of inferring knowledge from representations of the world;
- the construction of a medium for efficient computation, in which the formal representation provides the means to organise domain knowledge, allowing for efficient (and consistent) queries, updates and revisions of the knowledge base;
- the rigorous treatment of ontological commitments, which provide the base rules that guide reasoning about the world. For instance,
    - what should or should not be considered as the effects of actions
    - the nature of knowledge about temporal entities
    - belief change
    - vagueness
    - spatial entities

Davis et al., "What Is a Knowledge Representation?", AI Magazine vol. 14, 1993.

What is QSR?

The formal representation of (qualitative) spatial knowledge in terms of some basic entities and primitive relations in order to allow meaningful and, sometimes, efficient inference methods about space.

Spatial Reasoning

“The basic stories we know best are small stories of events in space: The wind blows clouds through the sky, a child throws a rock, a mother pours milk into a glass, a whale swims through the water. These stories constitute our world.” (M. Turner, The Literary Mind)

Philosophical origins of QSR

- the fundamental principles of Geometry were first investigated in ancient Greece (by Thales, circa 600 B.C.)
- the laws of valid argument in terms of logical modes of inference were studied separately by early Greek philosophers
- analytic geometry (Descartes 1637)
- 19th century revolution in spatial reasoning:
    - non-Euclidean geometries (e.g. Lobachevsky's hyperbolic geometry, 1829)
    - Cantor's (1845-1918) point-set topology
    - Poincaré's algebraic topology

Philosophical origins of QSR: 20th century

- Bertrand Russell and Alfred Whitehead: applied principles of logic to phenomenological theories, describing the world as it is perceived through sense data;
- Whitehead (1920): a theory of the perceived world should have as basic entities the very 'phenomena': integral objects or events
    - geometry becomes concerned with relationships between regions occupied by bodies, and dynamical laws with qualitative rules about world events

Origins of QSR in Artificial Intelligence

- phenomenological theories: closer to human reasoning
- would it be possible to automate them?
- at least for some particular domains?
- the construction of formal theories about the qualitative relationships between basic spatial (“phenomenological”) entities is the main goal of qualitative spatial reasoning in AI.
- but perhaps we should talk about time before talking about space...

Allen’s interval calculus

Figure: adapted from M. Ragni, Reasoning in Dynamic Environments, KI2006

- 13 relations
- the relation between any two intervals is expressed by one and only one of them: the relations are jointly exhaustive and pairwise disjoint (JEPD)
- temporal reasoning: derive facts about temporal intervals
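The JEPD property can be illustrated with a short Python sketch that returns the single Allen relation holding between two intervals. The function name `allen`, the tuple representation `(start, end)` and the short relation symbols are assumptions chosen for this example, not part of the original slides.

```python
from itertools import combinations, product

def allen(a, b):
    """Return the single Allen relation between intervals a=(start, end) and b."""
    (a_s, a_e), (b_s, b_e) = a, b
    if a_e < b_s:
        return "<"                                  # before
    if b_e < a_s:
        return ">"                                  # after
    if a_e == b_s:
        return "m"                                  # meets
    if b_e == a_s:
        return "mi"                                 # met-by
    if a_s == b_s and a_e == b_e:
        return "="                                  # equal
    if a_s == b_s:
        return "s" if a_e < b_e else "si"           # starts / started-by
    if a_e == b_e:
        return "f" if a_s > b_s else "fi"           # finishes / finished-by
    if b_s < a_s and a_e < b_e:
        return "d"                                  # during
    if a_s < b_s and b_e < a_e:
        return "di"                                 # contains
    return "o" if a_s < b_s else "oi"               # overlaps / overlapped-by

# All proper intervals with endpoints in 0..3; four points are enough
# to realise every configuration of two intervals.
intervals = list(combinations(range(4), 2))
seen = {allen(a, b) for a, b in product(intervals, repeat=2)}
print(len(seen))
```

Because the function always returns exactly one label, every pair of intervals falls under one and only one relation; enumerating a small set of intervals shows how many distinct labels actually occur.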

Allen’s interval calculus: applications

- represent activities and temporal knowledge
- planning and scheduling
- temporal databases
- natural language understanding

Tools of temporal reasoning

- Conceptual Neighbourhood Diagrams: graphs whose vertices represent relations on some specific objects and whose edges represent continuous transitions between these relations.
    - Continuous transitions: between adjacent vertices of the graph there is no other relation that the entities in the relation's arguments can assume.
- Composition (or transitivity) tables: given two relations on any objects a, b, and c (e.g., R1(a, b) and R2(b, c)), the composition table entry for R1 and R2 gives the minimal disjunction of relations R3(a, c) that can hold between a and c.

Conceptual Neighbourhood Diagram: illustration

Figure: adapted from M. Ragni, Reasoning in Dynamic Environments, KI2006

Transitivity table

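Rows give R1(a, b), columns give R2(b, c); each entry lists the possible relations between a and c.

                   | <               | >                 | d                             | di                | ...
before (<)         | <               | no info           | <; o; m; d; s                 | <                 | ...
after (>)          | no info         | >                 | >; oi; mi; d; f               | >                 | ...
during (d)         | <               | >                 | d                             | no info           | ...
contains (di)      | <; o; m; di; fi | >; oi; di; mi; si | o; oi; d; s; f; di; si; fi; = | di                | ...
overlaps (o)       | <               | >; oi; di; mi; si | o; d; s                       | <; o; m; di; fi   | ...
overlapped by (oi) | <; o; m; di; fi | >                 | oi; d; f                      | >; oi; mi; di; si | ...
meets (m)          | <               | >; oi; mi; di; si | o; d; s                       | <                 | ...
met-by (mi)        | <; o; m; di; fi | >                 | oi; d; f                      | >                 | ...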
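Entries of a transitivity table for Allen's calculus can be recomputed by brute force. The sketch below identifies each relation by the signs of four endpoint comparisons and enumerates all triples of intervals over six integer points, which is enough to realise every ordering of the six endpoints of three intervals. The names `SIGNATURES`, `relation` and `compose` are assumptions made for this illustration.

```python
from itertools import combinations, product

def sign(n):
    return (n > 0) - (n < 0)

# Each Allen relation is identified by the signs of
# (a.start - b.start, a.start - b.end, a.end - b.start, a.end - b.end).
SIGNATURES = {
    (-1, -1, -1, -1): "<", (-1, -1, 0, -1): "m", (-1, -1, 1, -1): "o",
    (-1, -1, 1, 0): "fi", (-1, -1, 1, 1): "di", (0, -1, 1, -1): "s",
    (0, -1, 1, 0): "=", (0, -1, 1, 1): "si", (1, -1, 1, -1): "d",
    (1, -1, 1, 0): "f", (1, -1, 1, 1): "oi", (1, 0, 1, 1): "mi",
    (1, 1, 1, 1): ">",
}

def relation(a, b):
    """Allen relation between intervals a and b, each a (start, end) pair."""
    return SIGNATURES[(sign(a[0] - b[0]), sign(a[0] - b[1]),
                       sign(a[1] - b[0]), sign(a[1] - b[1]))]

def compose(r1, r2, points=6):
    """Relations possible between a and c when r1(a, b) and r2(b, c) hold."""
    intervals = list(combinations(range(points), 2))
    return {relation(a, c)
            for a, b, c in product(intervals, repeat=3)
            if relation(a, b) == r1 and relation(b, c) == r2}

print(sorted(compose("<", "d")))
print(sorted(compose("mi", "di")))
```

Enumeration is slow compared with a table lookup, which is why reasoners store the table once; the sketch is only meant to show where the entries come from.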

Basic elements in a spatial theory

- space: absolute or relative; global or local;
- basic entities: points, regions, directions, bodies, shapes, things, sense-data, ...
- primitive relations: meet, between, connect, part-of, ...
    - the set of relations is usually Jointly Exhaustive and Pairwise Disjoint (JEPD)
- formal tools: axiomatic (deriving axioms and proving theorems), algebraic (encode knowledge with operators and equations), purely logical (design of a spatial logic)

Tools of QSR

- Conceptual Neighbourhood Diagrams
- Composition (or transitivity) tables

Region Connection Calculus (RCC)

- many-sorted, first-order logic axiomatisation of spatial regions, based on a primitive binary relation about the connection between two regions (C/2).
- C(x, y) (“x is connected to y”) holds when the topological closures of x and y share at least one point.

∀x C(x, x);
∀x∀y (C(x, y) → C(y, x));
∀x∀y (∀z (C(z, x) ↔ C(z, y)) → x = y).

Region Connection Calculus

- P(x, y), “x is part of y”;
- O(x, y), “x overlaps y”;
- PP(x, y), “x is a proper part of y”;
- Pi/2 and PPi/2 are the inverse relations of P/2 and PP/2, respectively;
- DC(x, y), “x is disconnected from y”;
- EQ(x, y), “x is equal to y”;
- PO(x, y), “x partially overlaps y”;
- EC(x, y), “x is externally connected to y”;
- TPP(x, y), “x is a tangential proper part of y”;
- NTPP(x, y), “x is a non-tangential proper part of y”;
- TPPi/2 and NTPPi/2 are the inverses of TPP/2 and NTPP/2.

{DC(x, y), EQ(x, y), PO(x, y), EC(x, y), TPP(x, y), NTPP(x, y), TPPi(x, y), NTPPi(x, y)} is a JEPD set known as RCC8.

RCC: conceptual neighbourhood diagram

Figure: conceptual neighbourhood diagram of the RCC8 relations (node labels recovered: DC, EC, PO, EQ, NTPP).

Region Connection Calculus

DC(x, y) ↔ ¬C(x, y). (1)
P(x, y) ↔ ∀z (C(z, x) → C(z, y)). (2)
PP(x, y) ↔ (P(x, y) ∧ ¬P(y, x)). (3)
EQ(x, y) ↔ (P(x, y) ∧ P(y, x)). (4)
O(x, y) ↔ ∃z (P(z, x) ∧ P(z, y)). (5)
PO(x, y) ↔ (O(x, y) ∧ ¬P(x, y) ∧ ¬P(y, x)). (6)
DR(x, y) ↔ ¬O(x, y). (7)
EC(x, y) ↔ (C(x, y) ∧ ¬O(x, y)). (8)
TPP(x, y) ↔ (PP(x, y) ∧ ∃z (EC(z, x) ∧ EC(z, y))). (9)
NTPP(x, y) ↔ (PP(x, y) ∧ ¬∃z (EC(z, x) ∧ EC(z, y))). (10)
Pi(x, y) ↔ P(y, x). (11)
PPi(x, y) ↔ PP(y, x). (12)
TPPi(x, y) ↔ TPP(y, x). (13)
NTPPi(x, y) ↔ NTPP(y, x). (14)
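To make the RCC8 relations concrete, here is a minimal Python sketch that classifies the relation between two closed discs in the plane. Restricting regions to discs is an assumption made purely for illustration (RCC itself is defined over arbitrary regions), and the function name `rcc8_discs` is hypothetical.

```python
from math import hypot, isclose

def rcc8_discs(c1, r1, c2, r2):
    """RCC8 relation between closed discs (centre c1, radius r1) and (c2, r2)."""
    d = hypot(c1[0] - c2[0], c1[1] - c2[1])    # distance between centres
    if isclose(d, 0.0, abs_tol=1e-9) and isclose(r1, r2):
        return "EQ"
    if isclose(d, r1 + r2):
        return "EC"                            # boundaries touch from outside
    if d > r1 + r2:
        return "DC"                            # no shared point
    if isclose(d, abs(r1 - r2)):
        return "TPP" if r1 < r2 else "TPPi"    # inner tangency
    if d < abs(r1 - r2):
        return "NTPP" if r1 < r2 else "NTPPi"  # strict containment
    return "PO"                                # interiors overlap, no containment

print(rcc8_discs((0, 0), 1, (3, 0), 1))
print(rcc8_discs((1, 0), 1, (0, 0), 2))
```

For discs, the eight cases reduce to comparing the centre distance with the sum and the difference of the radii, which mirrors definitions (1) to (14): DC and EC depend only on connection, while TPP and NTPP differ in whether the boundaries touch.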

RCC: Transitivity table

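Rows give R1(a, b), columns give R2(b, c). DR abbreviates {DC, EC}, PP abbreviates {TPP, NTPP}, and PPi abbreviates {TPPi, NTPPi}.

      | DC        | EC                 | PO        | TPP             | ...
DC    | no info   | DR,PO,PP           | DR,PO,PP  | DR,PO,PP        | ...
EC    | DR,PO,PPi | DR,PO,TPP,TPPi,EQ  | DR,PO,PP  | EC,PO,PP        | ...
PO    | DR,PO,PPi | DR,PO,PPi          | no info   | PO,PP           | ...
TPP   | DC        | DR                 | DR,PO,PP  | PP              | ...
NTPP  | DC        | DC                 | DR,PO,PP  | NTPP            | ...
TPPi  | DR,PO,PPi | EC,PO,PPi          | PO,PPi    | PO,TPP,TPPi,EQ  | ...
NTPPi | DR,PO,PPi | PO,PPi             | PO,PPi    | PO,PPi          | ...
EQ    | DC        | EC                 | PO        | TPP             | ...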

Lines of Sight Calculus

Represents the relative positions between pairs of (non-overlapping) convex bodies.

Figure: Randell et al., From Images to Bodies: Modelling and Exploiting Spatial Occlusion and Motion Parallax. IJCAI 2001

14 relations:
- C(x, y), is clear from;
- JC(x, y), is just clear from;
- PH(x, y), partially hides;
- PHI(x, y), is partially hidden by;
- JH(x, y), just hides;
- JHI(x, y), is just hidden by;
- H(x, y), hides;
- HI(x, y), is hidden by;
- EH(x, y), exactly hides;
- EHI(x, y), is exactly hidden by;
- F(x, y), is in front of;
- FI(x, y), has y in front of it;
- JF(x, y), is just in front of;
- JFI(x, y), has y just in front of it.

Region Occlusion Calculus (ROC)

- Region Occlusion Calculus (ROC) is an extension of RCC to represent the various possibilities of interposition (occlusion) between arbitrarily shaped bodies.
- Two functions: region and image.
    - region maps a physical body to its occupancy region.
    - image maps a physical body to its two-dimensional projection from a particular viewpoint.

ROC primitive relations: C/2 and TotallyOccludes(x, y, ν) (“body x totally occludes body y from the viewpoint ν”).

∀x∀ν ¬TotallyOccludes(x, x, ν)
∀x∀y∀z∀ν ((TotallyOccludes(x, y, ν) ∧ TotallyOccludes(y, z, ν)) → TotallyOccludes(x, z, ν))

Region Occlusion Calculus

The following axioms introduce RCC8 in ROC:

(TotallyOccludes(x, y, ν) ∧ P(region(z), region(y))) → TotallyOccludes(x, z, ν)

TotallyOccludes(x, y, ν) → ∀z (P(region(z), region(y)) → ¬TotallyOccludes(z, x, ν))

TotallyOccludes(x, y, ν) → ∀z∀u ((P(region(z), region(x)) ∧ P(region(u), region(y))) → ¬TotallyOccludes(u, z, ν))

∃y∃z (P(region(y), region(x)) ∧ P(region(z), region(x)) ∧ TotallyOccludes(y, z, ν))

TotallyOccludes(x, y, ν) → P(image(y, ν), image(x, ν))

We can introduce a weaker notion of occlusion, Occludes/3:

Occludes(x, y, ν) ↔ ∃z∃u (P(region(z), region(x)) ∧ P(region(u), region(y)) ∧ TotallyOccludes(z, u, ν))

Non-occlusion, partial occlusion, mutual occlusion:

PartiallyOccludes(x, y, ν) ↔ Occludes(x, y, ν) ∧ ¬TotallyOccludes(x, y, ν) ∧ ¬Occludes(y, x, ν)
MutuallyOccludes(x, y, ν) ↔ Occludes(x, y, ν) ∧ Occludes(y, x, ν)
NonOccludes(x, y, ν) ↔ ¬Occludes(x, y, ν) ∧ ¬Occludes(y, x, ν)
NonOccludes(x, y, ν) → DR(image(x, ν), image(y, ν))
PartiallyOccludes(x, y, ν) → (PO(image(x, ν), image(y, ν)) ∨ PP(image(x, ν), image(y, ν)))
MutuallyOccludes(x, y, ν) → (PO(image(x, ν), image(y, ν)) ∨ P(image(x, ν), image(y, ν)) ∨ Pi(image(x, ν), image(y, ν)))
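The three derived definitions can be read as a small decision procedure over truth values. The Python sketch below assumes the truth values of Occludes and TotallyOccludes for a pair of bodies are already known (how they are obtained from images is outside its scope); the function name `occlusion_relations` and its boolean parameters are assumptions made for this illustration.

```python
def occlusion_relations(occ_xy, occ_yx, total_xy):
    """Names of the ROC relations that hold for (x, y, ν).

    occ_xy   -- truth value of Occludes(x, y, ν)
    occ_yx   -- truth value of Occludes(y, x, ν)
    total_xy -- truth value of TotallyOccludes(x, y, ν), which implies occ_xy
    """
    if total_xy and not occ_xy:
        raise ValueError("TotallyOccludes(x, y, ν) implies Occludes(x, y, ν)")
    holds = set()
    if occ_xy and not total_xy and not occ_yx:
        holds.add("PartiallyOccludes")
    if occ_xy and occ_yx:
        holds.add("MutuallyOccludes")
    if not occ_xy and not occ_yx:
        holds.add("NonOccludes")
    if total_xy:
        holds.add("TotallyOccludes")
    return holds

print(occlusion_relations(occ_xy=True, occ_yx=False, total_xy=False))
```

When only Occludes(y, x, ν) holds, the function returns the empty set for the ordered pair (x, y): the occlusion is then described by calling it with the arguments swapped.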

Figure: the ROC relations obtained by combining occlusion with the RCC8 relation between images: NonOccludesDC, NonOccludesEC, PartiallyOccludesPO, PartiallyOccludesTPP, PartiallyOccludesNTPP, TotallyOccludesEQ, TotallyOccludesTPPI, TotallyOccludesNTPPI, MutuallyOccludesPO, MutuallyOccludesEQ, MutuallyOccludesTPP, MutuallyOccludesNTPP, and the inverses (marked with the superscript −1) of the Partially-, Totally- and MutuallyOccludes relations.

Cardinal Direction Calculus

- Cardinal Direction Calculus (CDC) is a formalism for reasoning about the directions between spatial objects
- 9 base relations: north, south, east, west, northeast, northwest, southeast, southwest and EQ (EQ(x, y) means “x is at the same direction as y”).
- The main goal of CDC is to infer facts about the relative direction of two objects A and B from the known directions between A and C and between C and B (A ≠ C and B ≠ C).
    - E.g., from north(A, B) and northeast(B, C), the task is to calculate the possible directions between A and C.

Figure: Adapted from SparQ User Manual v0.7
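The inference task in the example can be sketched by brute force in Python. The sketch assumes the projection-based reading of the calculus over points (compare the x and y coordinates separately) and reads north(A, B) as “A is north of B”; both are assumptions of this illustration, as are the names `direction` and `compose`. A 3 × 3 grid is enough because three points have at most three distinct coordinates per axis.

```python
from itertools import product

def sign(n):
    return (n > 0) - (n < 0)

NAMES = {
    (0, 0): "EQ", (0, 1): "north", (0, -1): "south",
    (1, 0): "east", (-1, 0): "west",
    (1, 1): "northeast", (-1, 1): "northwest",
    (1, -1): "southeast", (-1, -1): "southwest",
}

def direction(a, b):
    """Direction of point a relative to point b ('north' if a is due north of b)."""
    return NAMES[(sign(a[0] - b[0]), sign(a[1] - b[1]))]

def compose(r1, r2, size=3):
    """Possible directions of a relative to c, given r1(a, b) and r2(b, c)."""
    points = list(product(range(size), repeat=2))
    return {direction(a, c)
            for a, b, c in product(points, repeat=3)
            if direction(a, b) == r1 and direction(b, c) == r2}

print(compose("north", "northeast"))
print(sorted(compose("north", "south")))
```

Under these assumptions the first query has a single answer, while the second is ambiguous: knowing that A is north of B and B is south of C leaves A north of, south of, or at the same place as C.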

Double Cross Calculus

- is a calculus that defines the direction of a point with respect to a directed line segment
- 15 ternary relations on points
- represents every distinct relation between the directions left-right and front-back (e.g. left-front, left-back, left-line, left-perpendicular, straight-front, ...)
- motivation: qualitative description of paths

Figure: Adapted from SparQ User Manual v0.7
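A possible way to compute such a relation is sketched below in Python: the side (left, straight, right) comes from the sign of a cross product, and the front-back zone from the projection of the point onto the segment, compared with the two perpendiculars through the segment's endpoints. The function name `double_cross` and the labels it returns are hypothetical and do not follow the SparQ naming; the sketch only shows that three sides times five zones give the 15 cases.

```python
def double_cross(a, b, c):
    """Qualitative position of point c relative to the directed segment a -> b.

    Returns (side, zone): side is 'left', 'straight' or 'right';
    zone is 'front', 'at-b', 'between', 'at-a' or 'back'.
    """
    if a == b:
        raise ValueError("a and b must be distinct points")
    abx, aby = b[0] - a[0], b[1] - a[1]
    acx, acy = c[0] - a[0], c[1] - a[1]
    cross = abx * acy - aby * acx      # > 0 when c lies to the left of a -> b
    along = abx * acx + aby * acy      # projection of c on a -> b, times |ab|
    length2 = abx * abx + aby * aby    # |ab| squared
    side = "left" if cross > 0 else "right" if cross < 0 else "straight"
    if along > length2:
        zone = "front"                 # beyond the perpendicular through b
    elif along == length2:
        zone = "at-b"                  # on the perpendicular through b
    elif along > 0:
        zone = "between"               # between the two perpendiculars
    elif along == 0:
        zone = "at-a"                  # on the perpendicular through a
    else:
        zone = "back"                  # behind the perpendicular through a
    return side, zone

# Heading north from (0, 0) to (0, 2): a point to the west and ahead is left-front.
print(double_cross((0, 0), (0, 2), (-1, 3)))
```

With integer coordinates all comparisons are exact, so the boundary cases (on the line, on a perpendicular) are classified reliably; with floating-point input a tolerance would be needed.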

Other Calculi

- size and distance: defined either on an absolute scale (where the relations < and > are introduced in the usual way) or in relative terms, where the relative connection between three objects is used to define the relations of proximity (nearer than, farther than) and equidistance. Size and distance calculi are usually coupled with other calculi to extend their expressivity.
- shapes: one of the least understood areas of QSR. In general, shape is defined by means of a number of primitives such as interior or boundary.
- default: not much has been done with respect to default theories about space; Shanahan formalises a pre-condition about the spatial occupancy of objects, assuming that space is empty by default.
- spatial change: the basis for the development of spatial change is the work of Galton, where both time instants and intervals are included. Two predicates are used to this end: HoldsT represents a spatial state that is true at a time instant, whereas HoldsI represents a state that is true during a time interval. From this, 8 distinct kinds of transitions between pairs of states are defined in order to represent the relation of two states in time.
- Qualitative Trajectory Calculi, Line Segments, Dipole Calculi, and many others!

SparQ: a toolbox for QSR in applications
    - reference implementations of QSR
    - typical QSR procedures
    - uniform interface

http://www.sfbtr8.uni-bremen.de/project/r3/sparq/

Applications of QSR

- qualitative simulation of physical systems
- syntax and semantics of visual programming languages
- database integration
- GIS
- real-time event recognition
- robotics

Tractability and computability

- in general, the more expressive a formalism is, the less efficient reasoning with it becomes
- the first-order formulation of mereotopology is not decidable
    - find a decidable subset (e.g. RCC8)
    - look for tractable subsets

Tractable subsets of QSR formalisms are the subject of a number of papers. In particular, (Renz and Cohn 2008) presents the following ingredients for finding tractable subsets of such formalisms:

- a method to prove that a given subset is tractable
- a method to suggest possible tractable subsets
- in order to prove that a given tractable set of relations is maximal, it is sufficient to prove that the inclusion of any new relation makes the set intractable

J. Renz, Qualitative Spatial and Temporal Reasoning: Efficient Algorithms for Everyone, in: Proc. IJCAI-07

Our knowledge of the external world

(...) what is actually given in sense is much less than most people would naturally suppose, and (...) much of what at first sight seems to be given is really inferred. This applies especially in regard to our space-perceptions. For instance, we unconsciously infer the “real” size and shape of a visible object from its apparent size and shape, according to its distance and our point of view. (...). Thus, the first step in the analysis of data, namely, the discovery of what is really given in sense, is full of difficulty.

[B. Russell (1914), Our Knowledge of the External World, pp. 75-76]

Although these ideas are at the foundations of QSR, their application in scene analysis is still work in progress.

What is CogVis? (from the ECVision network)

- A Cognitive Vision System can achieve the four levels of generic visual functionality: Detection, Localisation, Recognition, Understanding (role, context, purpose);
- it exhibits purposive goal-directed behaviour, is adaptive to unforeseen changes, and can anticipate the occurrence of objects and events. This is achieved through:
    - Learning semantic knowledge (form, function and behaviours)
    - Retention of knowledge (about the cognitive system, its environment, and the relationship with the environment)
    - Deliberation about objects and events, including the cognitive system itself.

Some approaches for CogVis

- VITRA system (Visual Translator): integration of computer vision and natural language processing
- ALVEN system: textual description of heart dynamics from X-ray image sequences
- Brand's visual understanding through causal analysis
- Siskind's systems: event classification from camera input using force dynamics
- Leeds traffic interaction: modelling traffic interaction using learnt qualitative spatio-temporal relations and variable-length Markov models
- Leeds protocol learning: combining continuous and symbolic models to learn games from observation
- FEI contributions: spatial reasoning for image analysis

Nagel’s Image Sequence Evaluation

Requirements and purpose:
- exhaustive internal representation for all tasks and experimental conditions it is expected to handle
- situation: suitable intermediate representation during the evaluation of image sequences
- derive scene-specific conceptual descriptions from image sequences, based on general assumptions about motion as the cause of observable changes

Image sequence evaluation is based on the idea that the flow of image sequences reflects (to some extent) the coherency of the conceptual world.

Intermediate levels of description:
- change: any deviation of the sensor signal which differs significantly from noise
- event: any change which has been defined a priori as a primitive for the construction of more complex descriptions
- verb: describes activities
- history: extended sequence of activities

Some early systems

- Morio (Nagel 1983), Naos (Neumann 1986), Epex (Nagel 1987), CityTour (Herzog 1986)
- usually describing traffic scenes/situations from a static viewpoint (using motion verbs such as overtake, approach, move, beside, recede)
- but also describing a walking person (Hogg 1983), heart motion (Tsotsos 1980) and so on

Project Vitra: Visual Translator

- long-term project: 80's and 90's
- design and construction of integrated knowledge-based systems capable of translating visual information into natural language descriptions
- general goals:
    - extend the scope of scene analysis beyond the level of object recognition
    - explicit description of spatial configurations by means of spatial relations
    - interpretation of object movement
    - automatic recognition of goals and plans of observed agents
- specific goals:
    - answering queries about traffic situations
    - generating reports of football games
    - communicating with a mobile robot

Overview of the system:
- from raw data, image analysis generates a geometrical representation of the scene, with the objects' locations through time
- the geometric description is interpreted by the cognitive level (or high-level scene analysis)
- this high-level analysis extracts:
    - spatial relations
    - interesting motion events
    - presumed intentions, plans and plan interactions between agents
- the final, linguistic, level transforms the conceptual descriptions into utterances

Alven (1984)

- natural language description of ventricular wall motion
- data extracted from X-ray image sequences
- knowledge organisation: ontology
- analyse pre-operative and post-operative marker films to evaluate the efficacy of surgery
- analysis using both quantitative and qualitative representations
- major issues:
    - understanding visual motion from image tokens over time
    - reasoning about spatio-temporal relationships

Alven’s knowledge base

Set of classes organised using the relations:
- IS_A: generalisation/specialisation (taxonomy)
- PART_OF: part/whole (partonomy)
- temporal precedence

Should satisfy:
1 motion classes should be sufficient to express the domain
2 image tokens are connected to general knowledge in the leaves of a PART_OF hierarchy of motion concepts

Alven’s control structure

- extract tokens from the input signal, instantiating them in the leaves of the PART_OF hierarchy
- follow this hierarchy to activate hypotheses that are aggregates of the input tokens
- this set of hypotheses is specialised by going down one level in the IS_A hierarchy
- each hypothesis is matched with other data instances: matching leads to further specialisations, failure leads to the selection of other hypotheses
- the best hypotheses generate a set of predictions
- predictions are mapped back to the image level and are used as guidance for the token extraction procedure

Brand’s visual understanding through causal analysis

- describe the causal structure from images
- knowledge about physical causality is used in the interpretation of scenes
- hypothesis: a small core set of qualitative rules accounts for most of what humans ordinarily see
- it is possible to make useful inferences with qualitative knowledge about connectivity and free space
- qualitative rules of connectivity, friction, attachment and penetration
- interpretation of simple mechanical machines
- purpose:
    - analysis
    - diagnosis
    - prediction
    - inspection

Brand’s SPROCKET (1997)

Siskind's Event classification from camera input using force dynamics

- similar to Brand's but on image sequences
- force dynamics: SUPPORTED, ATTACHED, PICKUP, PUTDOWN, CONTACT
- more recently: supervised learning of visual event definitions from video

Siskind’s Leonard System

Leeds Traffic Interaction

- apply QSR to dynamic scene analysis
- learning events from the observation of traffic scenes
- recognising these events and generating predictions
- a scene is interpreted by comparing its temporal development with a (previously learned) transition diagram representing change in the domain.

CND of relative positions

Protocol learning

- A system for learning protocol behaviour from computer vision data using ILP
- Unsupervised continuous learning of perceptual categories and unsupervised symbolic learning of protocols (as sets of Horn clauses)

Protocol Learning

- Off-the-shelf vision system
    - Attention: identifies key frames with no motion preceded by a number of frames with motion;
    - Statistical classifier: assigns different classes to clusters of features
- Off-the-shelf ILP system: Progol
    - Generalises a set of positive-only examples according to user-defined mode declarations
    - Mode declarations restrict the possible form of the proposed generalisations

- Inducing axioms of ordering and equivalence, without knowing about numbers
- Building equivalence classes to cope with over-clustering
- soundness and completeness of the agent actually playing the game [AIJ 05]

Example of QSR from Vision: Depth Profile Calculus

- How much knowledge about a robot's environment can be obtained from vision alone?
- How can we construct knowledge about objects in the world from the robot's sensor data?
- Construction of a qualitative spatial reasoning (QSR) system based on sensor data
- Use abduction for sensor data assimilation and deduction for predictions

Simplified environment

Spatial Reasoning about Robot Sensor Data

Attributes:
    - Distance, disparity, size;
    - Changes in the sensor data;

Representation:
    - Depth profiles and time points;
    - Displacement between regions;
    - Mapping function between images and objects

Assimilating changes

Axioms of the system:

<dynamic spatial relation> ← <description of sensor transition>
<dynamic spatial relation> ← <object-observer relation>

Depth Profiles

The model

- Extract one horizontal depth profile of each scene from the visual data;
- Objects in the scenes are represented as peaks;
- Axiomatise relations on the depth and size of these profiles as well as displacements.

Depth Profile Calculus

- A theory about displacement, size and depth;
- 27 base relations;
- Large and complex conceptual neighbourhood diagrams and composition tables.

DPC example

Reasoning about Shadows in Robotics

- making explicit the knowledge contained in cast shadows
- using it to reason about the robot environment
- Computer vision, however, has largely been filtering out cast shadows as noise

Illusory Motion from Shadows

Perception of shadows

- “no luminous body ever sees the shadows that it generates” [da Vinci, Notebooks of Leonardo Da Vinci. Project Gutenberg (1888)]
- from the light source viewpoint, shadows are occluded by their casters
- We model observer-caster-shadow within qualitative spatial reasoning: ROC + an axiom about shadows:

Shadow(s, o, Scr, L) ↔ PO(r(s), r(Scr)) ∧ TotallyOccludes(o, s, L) ∧ ¬∃o′ TotallyOccludes(o′, o, L).

“a shadow is totally occluded by its caster from the light source viewpoint”

Qualitative regions for self-localisation

In practice

- Qualitative robot self-localisation
- relative depth from the observation of shadows
- threshold finding from qualitative regions

located(Region 1, ν, o, s) ← Is_a_Shadow(s, o) ∧ NonOccludesDC(o, s, ν) ∧ ν ≠ o;

located(Region 2, ν, o, s) ← Is_a_Shadow(s, o) ∧ NonOccludesEC(o, s, ν) ∧ ν ≠ o;

located(Region 3, ν, o, s) ← Is_a_Shadow(s, o) ∧ PartiallyOccludesPO(o, s, ν) ∧ ν ≠ o;

located(Region 4, ν, o, s) ← Is_a_Shadow(s, o) ∧ TotallyOccludesTPPI(o, s, ν) ∧ ν ≠ o;

located(Region 5, ν, o, s) ← Is_a_Shadow(s, o) ∧ TotallyOccludesNTPPI(o, s, ν) ∧ ν ≠ o.
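Read operationally, the five rules form a lookup from the ROC relation observed between an object and its shadow to the qualitative region where the observer stands. The Python sketch below assumes the relation has already been extracted from the image and is passed in as a string; the names `REGION_BY_RELATION` and `located` and the boolean parameters are assumptions made for this illustration.

```python
# Region number implied by each ROC relation between caster o and shadow s,
# as seen from the observer's viewpoint ν (taken from the five rules).
REGION_BY_RELATION = {
    "NonOccludesDC": 1,
    "NonOccludesEC": 2,
    "PartiallyOccludesPO": 3,
    "TotallyOccludesTPPI": 4,
    "TotallyOccludesNTPPI": 5,
}

def located(relation, is_shadow_of_object=True, observer_is_object=False):
    """Return the observer's region (1 to 5), or None when no rule applies."""
    if not is_shadow_of_object or observer_is_object:
        return None        # the preconditions Is_a_Shadow(s, o) and ν ≠ o fail
    return REGION_BY_RELATION.get(relation)

print(located("PartiallyOccludesPO"))
```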

Probabilistic Logic Encoding of Spatial Domains

- incorporate incomplete sensor data and domain knowledge in a probabilistic logic setting
- explore inferences about space

Traffic Scenario

Sensor model

Taxonomy of concepts

Lane = GoingUp ∪ GoingDown,
Divider = DashedDivider ∪ SolidDivider,
Vehicle = OnOneWayRoad ∪ OnTwoWayRoad,
disjoint(Vehicle, Divider, Lane),
disjoint(GoingUp, GoingDown),
disjoint(DashedDivider, SolidDivider),
disjoint(OnOneWayRoad, OnTwoWayRoad)

Hard Constraints

GoingUp ⊂ ∀ri.(GoingUp ∪ ¬Lane),

GoingDown ⊂ ∀le.(GoingDown ∪ ¬Lane),

DashedDivider ⊂ ∃ri.Lane ⊓ ∃le.Lane,

SolidDivider ⊂ (¬∃ri.Lane ∪ ¬∃le.Lane) ∪ (∃cdc.GoingUp ⊓ ∃cdc.GoingDown),

OnTwoWayRoad ⊂ ∃cdc.OneWayNorth ⊓ ∃cdc.OneWaySouth

Generate a Bayesian Net out of it

Answer queries such as

- On which lane are we? argmax_{l_i} P(v : On_{l_i}): l_i is the lane with the maximum probability of containing the vehicle.
- Which driving directions does each lane permit? ∀i : P(l_i : GoingDown): for each lane l_i, the probability of it being a GoingDown lane.

Conclusion

- We've presented a brief overview of QSR and CogVis
- Discussed some early work on CogVis systems
- Presented some new work based mainly at UoL and FEI
- Much work remains to be done!

Guidelines

- A. Cohn, Slides of the tutorial: Knowledge Representation and Reasoning for Computer Vision: Qualitative Spatio-temporal Reasoning.
- A. Cohn and J. Renz, Qualitative Spatial Representation and Reasoning, Handbook of Knowledge Representation, 2008.
- B. Bennett, Logical representations for automated reasoning about spatial relationships, PhD thesis, School of Computing, University of Leeds, UK.
- P. Santos, Raciocínio e Percepção espacial: uma abordagem lógica. Working notes of CBA 2010 tutorial. Tutoriais do XVIII Congresso Brasileiro de Automática, 2010.

Thanks!

This author was partially supported by:
- FAPESP LogProb project: 2008/03995-5
- CNPq bolsa PQ
