A Fault-Tolerant Distributed Vision System Architecture for Object Tracking in a Smart Room

Deepak R. Karuppiah, Zhigang Zhu, Prashant Shenoy, and Edward M. Riseman
Department of Computer Science, University of Massachusetts, Amherst, MA 01003, USA
{deepak|zhu|shenoy|riseman}@cs.umass.edu

Abstract. In recent years, distributed computer vision has gained a lot of attention within the computer vision community for applications such as video surveillance and object tracking. The collective information gathered by multiple strategically placed cameras has many advantages. For example, aggregation of information from multiple viewpoints reduces the uncertainty about the scene. Further, there is no single point of failure, so the system as a whole can continue to perform the task at hand. However, the advantages arising out of such cooperation can be realized only by timely sharing of information between the cameras. This paper discusses the design of a distributed vision system that enables several heterogeneous sensors with different processing rates to exchange information in a timely manner in order to achieve a common goal, say tracking of multiple human subjects and mobile robots in an indoor smart environment. In our fault-tolerant distributed vision system, a resource manager manages individual cameras and buffers the time-stamped object candidates received from them. A user agent with a given task specification approaches the resource manager, first to learn the available resources (cameras) and later to receive the object candidates from the resources of its interest. The resource manager thus acts as a proxy between the user agents and cameras, freeing the cameras to do dedicated feature detection and extraction only. In such a scenario, many failures are possible. For example, one of the cameras may have a hardware failure, or it may lose a target that has moved out of its field of view. In this context, important issues such as failure detection and handling, synchronization of data from multiple sensors, and sensor reconfiguration by view planning are discussed in the paper. Experimental results with real scene images are given.

1 Introduction

In recent years, rapid advances in low-cost, high-performance computers and sensors have spurred significant interest in ubiquitous computing. Researchers are now talking about deploying many different types of sensors in our homes, work places and

This work is supported by DARPA/ITO Mobile Autonomous Robots S/W (MARS) (Contract Number DOD DABT63-99-1-004) and Software for Distributed Robotics (SDR) (Contract Number DOD DABT63-99-1-0022).



even on people. They hope that the wealth of information from these sensors, when processed and inferred carefully, would significantly enhance our capacity to interact with the world around us. For instance, today, humans rely largely on their innate sensory and motor mechanisms to understand the environment and react to the various situations arising thereof. Though our intelligence is far superior to today's AI, the memory and number-crunching capacity of an average person leaves much to be desired. But with a distributed backbone of processors and sensors augmenting our brain and senses, elaborate information gathering and complex, systematic decision-making could become possible for everyone. A smart environment could assist humans in their daily activities such as teleconferencing, surveillance, etc. The smart environment idea is therefore not to replace a human but to augment one's capacity to do things in the environment. An interesting perspective on this area from the machine vision point of view has been provided in [14].

Distributed computer vision forms a vital component in a smart environment due to the rich information-gathering capability of vision sensors. The collective information gathered by multiple strategically placed cameras has many advantages. For example, aggregation of information from multiple viewpoints reduces the uncertainty about the scene. Further, there is no single point of failure, so the system as a whole can continue to perform the task at hand. However, the advantages arising out of such cooperation can be realized only by timely sharing of information between the cameras. The distributed system can then share the information to carry out tasks like inferring context, updating a knowledge base, archiving, etc. A distributed vision system, in general, should have the following capabilities:

– Extraction of useful feature sets from raw sensor data
– Selection and fusion of feature sets from different sensors
– Timely sharing of information among the sensors
– Fault-tolerance and reconfiguration

This paper discusses the design of such a distributed vision system that enables several heterogeneous sensors with different processing rates to exchange information in a timely manner in order to achieve a common goal, say tracking of multiple human subjects as well as mobile robots in an indoor environment, while reacting at run-time to various kinds of failures, including: hardware failure, inadequate sensor geometries, occlusion, and bandwidth limitations. Responding at run-time requires a combination of knowledge regarding the physical sensorimotor device, its use in coordinated sensing operations, and high-level process descriptions.

1.1 Related Work

The proposed work is related to two areas in the literature: multi-sensor networks and distributed self-adaptive software. Research on multi-sensor networks devoted to human tracking and identification can be found in [4], [13], [12], [14] and [15]. An integrated system of an active camera network has been proposed in [16] for human tracking and face recognition. In [2], a practical distributed vision system based on dynamic memory has been presented. In our previous work [17], we have presented a panoramic

Page 3: A Fault-Tolerant Distributed Vision System Architecture for ...A Fault-Tolerant Distributed Vision System Architecture for Object Tracking in a Smart Room Deepak R. Karuppiah, Zhigang

virtual stereo for human tracking and localization in mobile robots. However, most of the current systems emphasize vision algorithms, which are designed to function in a specific network. Important issues concerning fault-tolerance and sensor reconfiguration in a distributed system of sensors are seldom discussed.

These issues are addressed to some extent in the second area, namely distributed self-adaptive software. Much of current software development is based on the notion that one can correctly specify a system a priori. Such a specification must include all input data sets, which is impossible, in general, for embedded sensorimotor applications. Self-adaptive software, however, modifies its behavior based on observed progress toward goals as the system state evolves at run-time [8]. Current research in self-adaptive software draws from two traditions, namely control theoretic and planning. The control theoretic approach to self-adaptive software treats software as a plant with associated controllability and observability issues [7]. Time-critical applications require the ability to act quickly without spending large amounts of time on deliberation. Such reflexive behavior is the domain of the control theoretic tradition. Drawing from the planning community, a generic software infrastructure for adaptive fault-tolerance that allows different levels of availability requirements to be simultaneously supported in a networked environment has been presented in [6]. In [1], a distributed control architecture has been presented in which run-time behavior is both pre-analyzed and recovered empirically to inform local scheduling agents that commit resources autonomously subject to process control specifications.

1.2 Architecture Overview

The proposed distributed vision system has three levels of hierarchy: sensor nodes (S1 ... SN), resource managers (RM 1 ... RM K), and user agents (UA 1 ... UA M), as shown in Fig. 1. The lowest level consists of individual sensors like omni-directional cameras and pan-tilt-zoom cameras, which perform human and face detection using motion, color and texture cues on their data streams independently in (near) real-time. Each sensor reports its time-stamped object candidates (bearing, sizes, motion cues) to one or more resource managers at the next level. The communication protocol between the sensor and the resource manager could be either unicast or multicast. A resource manager acts as a proxy by making these object candidates available to the user agents at the topmost level. Thus the resource manager can serve many user agents simultaneously, freeing the sensors to do dedicated feature detection and extraction only. The user agent, in our application, matches the time-stamped object candidates from the most favorable sensors, estimates 3-D locations and extracts tracks of moving objects in the environment. There could be other user agents that use the same or different sensor information but with a different task specification as well.

All the components of the system communicate over an Ethernet LAN. The Network Time Protocol (NTP) is therefore used to synchronize the local clocks of the nodes, after verifying that the synchronization resolution provided by NTP is sufficient for our task. For further details on the implementation and applications of NTP, the reader is referred to [10] and [11].

Page 4: A Fault-Tolerant Distributed Vision System Architecture for ...A Fault-Tolerant Distributed Vision System Architecture for Object Tracking in a Smart Room Deepak R. Karuppiah, Zhigang

(Figure: sensor nodes S1, S2, S3 ... SN report to resource managers RM 1 ... RM K, which in turn serve user agents UA 1, UA 2 ... UA M. UA: User Agent, RM: Resource Manager, S: Sensor Node)

Fig. 1. System Architecture

2 Sensor Nodes

The typical flow of information at a sensor node is shown in Fig. 2. The lowest level is the sensor layer. A sensor node consists of a physical sensor and a processor. In general, the sensor node could also have motor capabilities, for example, a camera that can pan or a camera that is mounted on a robot. The processor could be a desktop computer or a simple embedded processor depending on the computational needs of the sensors. For example, a 68HC11 processor is sufficient for a simple pyro-electric sensor, which detects temperature changes. But a powerful desktop computer is needed for running motion detection algorithms on vision sensors in real-time. In any case, the nodes should be able to connect to a Local Area Network (LAN). This enables them to communicate with the layer immediately above in the hierarchy. Two such vision sensor nodes used in our human tracking system will be discussed in Sec. 6.

(Figure: sensor stimuli -> pre-processing/noise filter -> feature extraction -> Resource Manager Interface)

Fig. 2. Sensor Node

At the sensor level, the physical sensor perceives the environment, typically within a certain frequency spectrum of electromagnetic waves. The raw data is then digitized

Page 5: A Fault-Tolerant Distributed Vision System Architecture for ...A Fault-Tolerant Distributed Vision System Architecture for Object Tracking in a Smart Room Deepak R. Karuppiah, Zhigang

by the device drivers of the sensor. The digitized data is pre-processed to get rid of random and systematic noise (if the noise models are available). The noise-free data is then used to extract useful features of interest. The features thus extracted are streamed out to the resource manager via the Resource Manager Interface (RM-Interface), shown in Fig. 2. The extracted feature set falls into two classes, namely basic features and special features. Typically, basic features are common to all the sensor nodes, require low bandwidth and are streamed by default to the resource manager, while the special features are node-specific, require high bandwidth and are provided on demand by user agents. For the task of tracking and localization of human subjects in a smart room, the basic features include the bearing and size of moving subjects, while special features include their texture, color and motion cues. The special features can be used to match the objects across two cameras and thereby determine their 3-D location. Object matching using special features is further elaborated in Sec. 6.

When a sensor comes online, its RM-Interface reports to a resource manager the sensor's unique ID followed by its location and geometry in a global reference frame. On receiving a confirmation from the RM, it activates the sensor's processing loop, which does the motion detection and periodically reports the features. The RM-Interface is capable of receiving commands from the resource manager. Some commands are general, like pausing operation or changing the reporting rate. Others are specific to the resource, like motion commands to a PTZ platform or a mobile robot.
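The registration handshake described above can be sketched as follows. The JSON wire format and field names are my assumptions for illustration; the paper does not specify the message encoding.

```python
import json

def make_register_msg(sensor_id, location, geometry):
    """Build the report the RM-Interface sends when a sensor comes
    online: its unique ID followed by its location and geometry in a
    global reference frame. (JSON wire format is an assumption.)"""
    return json.dumps({"type": "register", "id": sensor_id,
                       "location": location, "geometry": geometry})

def is_confirmation(reply):
    """Check the RM's confirmation, after which the sensor activates
    its processing loop. (Field names are assumptions.)"""
    return json.loads(reply).get("status") == "ok"
```

A sensor would send `make_register_msg(...)` once at startup and start its motion-detection loop only when `is_confirmation(...)` returns true.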

3 Resource Manager

The structure of a resource manager is shown in Fig. 3. The resource manager (RM) is the intermediary between the producers of information (the sensor nodes) and their consumers (the user agents). The resource manager keeps track of the currently available sensors and reports their status to the user agent periodically. The user agent chooses the best sensor subset from the available pool to accomplish its goals. The resource manager, therefore, acts as a proxy between the sensors and user agents. Thus the resource manager can serve many user agents simultaneously, freeing the sensors to do dedicated feature detection and extraction only. This way the sensors are not committed to a single agent, but can be shared among many agents. However, the motor functions of a node cannot be shared, because conflicts arise when more than one agent attempts to perform a motor task on the same sensor node. So a lock manager manages the motor functions of a node. There is also the facility of maintaining multiple resource managers simultaneously (see Fig. 1). This improves fault-tolerance in the event of failure of a particular resource manager and improves performance by reducing the load per resource manager. When using multiple resource managers, better bandwidth utilization is achieved by employing a multicast communication protocol. In such a scenario, a sensor node pushes the data on the wire only once, addressing it to the multicast group to which the resource managers belong. The multicast backbone then efficiently routes the data to all the members of the group.

Page 6: A Fault-Tolerant Distributed Vision System Architecture for ...A Fault-Tolerant Distributed Vision System Architecture for Object Tracking in a Smart Room Deepak R. Karuppiah, Zhigang

(Figure: sensor nodes feed the multi-threaded data buffer (Data Buffers 1-4); User Agent Proxy 1 and User Agent Proxy 2 each draw from their chosen subset of buffers)

Fig. 3. Resource Manager

Page 7: A Fault-Tolerant Distributed Vision System Architecture for ...A Fault-Tolerant Distributed Vision System Architecture for Object Tracking in a Smart Room Deepak R. Karuppiah, Zhigang

3.1 Multi-threaded Data Buffer

In the RM, the Multi-threaded Data Buffer (MDB) collects these time-stamped feature sets from different sensors and buffers them. When the RM-Interface of a sensor begins the registration process by sending its unique sensor ID, the MDB checks the validity of the ID (i.e., whether it is a trusted and known sensor). This simple security check prevents the MDB from being flooded by unscrupulous requests for service. After validation, a thread is spawned to serve that particular type of sensor. The buffering process, which is crucial for synchronization of data from different sensors, is discussed in Sec. 5.
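The validate-then-spawn step above can be sketched minimally as follows; the set of trusted IDs and the function names are illustrative assumptions, not the system's actual API.

```python
import threading

# Assumed set of trusted sensor IDs (illustrative only)
TRUSTED_IDS = {"PANO-1", "PANO-2", "PTZ-1", "PTZ-2"}

def handle_registration(sensor_id, serve_fn):
    """Sketch of the MDB registration step: validate the sensor's
    unique ID against a trusted set, then spawn a thread to serve
    that sensor. Returns the thread, or None if the ID is rejected."""
    if sensor_id not in TRUSTED_IDS:
        return None  # reject unscrupulous requests for service
    t = threading.Thread(target=serve_fn, args=(sensor_id,), daemon=True)
    t.start()
    return t
```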

3.2 User Agent Proxy

The User Agent Proxy (UA-proxy) is the end point in the RM that serves the user agents. Similar to the MDB, the UA-proxy registers and validates a user agent, and a thread is spawned to begin the information exchange with the user agent. The UA-proxy provides the user agents with the list of currently available sensors. Depending on a user agent's choice from this list, the UA-proxy delivers the corresponding sensor parameters and their time-synchronized feature sets to the user agent. In Fig. 3, the solid lines between the UA-proxies and the data buffers in the MDB indicate the respective user agent's choice of sensors.

The states of sensors are pushed to the user agents when one of the following occurs:

– A new sensor has registered, or
– A sensor has failed.

Depending on this information, the user agent modifies its choice set and informs the UA-proxy, which then responds accordingly. The user agent can also request special features in addition to the default feature set from sensors via the resource manager.

4 User Agents

User agents are processes that achieve a specific goal by using multiple sensorimotor resources. When they go about doing this, the availability of the required resources is not always guaranteed. Even if all the resources are available, the environmental context may prevent them from reaching the goal. This may require some action in the form of redeployment of sensors to handle the new context. User agents are thus adaptive to hardware, software and environmental contexts. A straightforward way to realize a user agent is to identify useful system configurations and assign a behavior to each configuration. Since the combinatorics of this procedure is prohibitive when there are a large number of sensors, an inductive method could be used. In this method, the system could learn from past behavior and identify useful system configurations by itself. An example of a user agent used in our system will be given in Sec. 7.

Page 8: A Fault-Tolerant Distributed Vision System Architecture for ...A Fault-Tolerant Distributed Vision System Architecture for Object Tracking in a Smart Room Deepak R. Karuppiah, Zhigang

5 Time Synchronization Mechanism

The resource manager plays a vital role in making the sensors available to the user agents in a timely and fault-tolerant fashion. In this section we discuss how the features from multiple sensors are synchronized in time. A simple way to solve this would be to synchronize all the cameras using an electronic trigger signal. However, this hardware solution is not general enough to handle heterogeneous sensors and mobile sensors, which are an inevitable part of a smart environment. Even in the case of an all-camera network, such synchronization could prove to be difficult and expensive, especially when there are many cameras scattered over a wide area. With the objective of providing a simple, inexpensive yet general solution, we propose a software solution exploiting the existing time synchronization protocol (NTP) running in today's computer networks.

Once the computer nodes are synchronized by NTP, the sensors attached to them can work independently and time-stamp their data sets based on their local clocks. These data sets are buffered at the multi-threaded data buffer (MDB) in the resource manager. When a report has to be sent to a UA, its UA-proxy in the resource manager queries the MDB for the current data from the user agent's sensor set. The query is parameterized by the current time. The feature set of each sensor that is closest in time may be returned. This method works when different sensors process at approximately the same rate. However, if their processing rates differ by a wide margin, this procedure can lead to errors.
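The closest-in-time query described above might look like the following per-sensor buffer; the class name, capacity, and locking granularity are assumptions, not details from the paper.

```python
import bisect
import threading

class SensorBuffer:
    """Sketch of one per-sensor buffer in the MDB: stores time-stamped
    feature sets and answers queries for the entry nearest in time to
    a requested timestamp."""
    def __init__(self, capacity=256):
        self.capacity = capacity
        self.times, self.feats = [], []
        self.lock = threading.Lock()

    def push(self, t, features):
        """Append a time-stamped feature set (sensors report in time order)."""
        with self.lock:
            self.times.append(t)
            self.feats.append(features)
            if len(self.times) > self.capacity:   # drop the oldest entry
                self.times.pop(0)
                self.feats.pop(0)

    def closest(self, t):
        """Return the feature set whose timestamp is nearest to t."""
        with self.lock:
            i = bisect.bisect_left(self.times, t)
            candidates = [j for j in (i - 1, i) if 0 <= j < len(self.times)]
            if not candidates:
                return None
            j = min(candidates, key=lambda k: abs(self.times[k] - t))
            return self.feats[j]
```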

(Figure: object track observed by PANO_1 and PANO_2, with bearing measurements (θ1, t1) and (θ2, t2) from PANO_1 and (φ, t) from PANO_2)

Fig. 4. Illustration of errors due to lack of synchronized matching across two sensors. PANO_2 detects the object at white circles 1, 3 & 5 while PANO_1 does at 2 & 4. The dark circles show the location estimates using the most recent bearings from the sensors, while the gray circle shows the estimate at time t when interpolation is used

Let's demonstrate this with a simple example in which only the bearing angles to the moving subjects from the sensors are used. Suppose two panoramic sensors PANO_1

Page 9: A Fault-Tolerant Distributed Vision System Architecture for ...A Fault-Tolerant Distributed Vision System Architecture for Object Tracking in a Smart Room Deepak R. Karuppiah, Zhigang

and PANO_2 are registered in the RM and they report their bearings (to a single moving object in the scene) at different rates, as shown in Fig. 4. If the UA-proxy were to push the latest reported feature sets from the sensors to the UA, the result of fusion would be inaccurate, as shown in Fig. 4, because of the timing discrepancy between the two sets. The white circles show the real positions of the moving object along the track. The dark circles show the error in triangulation caused by matching data sets not synchronized in time.

So in such situations, the feature sets must be interpolated in time using suitable interpolation techniques (polynomials or splines). The interpolation algorithm must also be aware of certain properties of the feature sets, like the angular periodicity in the case of bearings. The MDB is capable of interpolating the data buffer to return the feature-set values for the time requested by the UA-proxy (see Fig. 5). This ensures that the feature sets being fused are close in time. However, it should be noted that the interpolation assumes that a linear or spline fit approximates the motion of a human subject between two time instants t1 and t2. In a typical walk of a human subject, the motion track can be approximated as piecewise linear, so the bearing requested for time t in PANO_1 can be calculated as

    θ(t) = θ1 + (θ2 - θ1)(t - t1) / (t2 - t1)    (1)

where θ1 and θ2 are the bearings measured in PANO_1 at times t1 and t2 respectively. The time t used in the above equation is the time for which a measurement of bearing φ is available from PANO_2. The angular periodicity of bearing angles has not been shown in Eqn. 1 for the sake of simplicity; however, it has been incorporated in the interpolation calculations on the real system. We can see that the result of triangulation using bearings θ (in PANO_1) and φ (in PANO_2), represented by a gray circle in Fig. 4, has reduced the error considerably. Real scene examples will be given in Sec. 7.
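Eqn. 1, together with the angular-periodicity handling mentioned in the text, can be sketched as follows (the function name is mine):

```python
import math

def interp_bearing(t, t1, theta1, t2, theta2):
    """Linearly interpolate a bearing (in radians) at time t, given
    measurements theta1 at t1 and theta2 at t2 (Eqn. 1), while
    handling the 2*pi angular periodicity of bearings."""
    # Shortest signed angular difference between the two bearings
    d = math.atan2(math.sin(theta2 - theta1), math.cos(theta2 - theta1))
    frac = (t - t1) / (t2 - t1)
    return (theta1 + frac * d) % (2 * math.pi)
```

For example, interpolating halfway between bearings 0 and π/2 yields π/4, and the wrap-around near 2π is handled by taking the shortest angular difference rather than the raw difference.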

(Figure: the UA-proxy clock drives interpolation queries against Data Buffer 1 and Data Buffer 2)

Fig. 5. Interactions between User Agent Proxy and Data Buffer

Page 10: A Fault-Tolerant Distributed Vision System Architecture for ...A Fault-Tolerant Distributed Vision System Architecture for Object Tracking in a Smart Room Deepak R. Karuppiah, Zhigang

This algorithm will work provided all the different layers in the hierarchy have a global notion of time. Using the Network Time Protocol (NTP), one can achieve a time synchronization resolution of the order of 5 ms among the nodes of a Local Area Network (LAN). If the fastest sensor in the system takes at least twice this time (every 10 ms) to produce a report, then this resolution is acceptable. Typically this holds, because cameras have a frame rate of 25 Hz with added overhead in processing (i.e., a report at most every 40 ms from a camera).

6 Vision Sensor Nodes

6.1 Panoramic Camera Nodes

An effective combination of transduction and image processing is essential for operating in an unpredictable environment and for rapidly focusing attention on important activities in the environment. A limited field of view (as with standard optics) often causes the camera resource to be blocked when multiple targets are not close together, and panning the camera to multiple targets takes time. We employ a camera with a panoramic lens [3] to simultaneously detect and track multiple moving objects in a full 360-degree view.

Figure 6 depicts the processing steps involved in detecting and tracking multiple moving humans. Four moving objects (people) were detected in real-time while moving in the scene in an unconstrained manner. A background image is generated automatically by tracking dynamic objects, though the background model depends on the number of moving objects in the scene and their motion. Each of the four people was extracted from the complex cluttered background and annotated with a bounding rectangle, a direction, and an estimated distance based on scale from the sensor. The system tracks each object through the image sequence as shown in Fig. 6, even in the presence of overlap and occlusion between two people. The dynamic track is represented as an elliptical head and body for the last 30 frames of each object. The human subjects reversed directions and occluded one another during this sequence. The vision algorithms can detect changes in the environment, illumination, and sensor failure, while refreshing the background accordingly. The detection rate of the current implementation for tracking two objects is about 5 Hz.

With a pair of panoramic sensor nodes, the 3-D location of multiple moving objects can be determined by triangulation. This pair can then act as a virtual sensor node which reports the 3-D location of moving objects to the resource manager. But there are two issues that must be considered, namely matching objects from widely separated viewpoints and triangulation accuracy when the object is collinear with the cameras. Robust matching can be done by using special features from the sensor nodes like size, intensity and color histogram of the motion blobs in addition to the basic features [17]. When the object is in a collinear configuration with the cameras (and if they are the only ones available at that instant), we employ the size-ratio method described in [17]. By using the ratio of the sizes of objects as seen from the two cameras, their distances from each camera can be calculated (see Fig. 7a). Figure 7b shows two subjects (walking in opposite directions on a rectangular track) being successfully tracked by the algorithm.
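Triangulation from two bearings, including the degenerate collinear case discussed above, can be sketched as follows. This is a 2-D ground-plane simplification; the names and the parallel-ray tolerance are my assumptions.

```python
import math

def triangulate(p1, theta1, p2, theta2):
    """Estimate the 2-D location of an object from two bearings.
    p1, p2: (x, y) camera positions; theta1, theta2: bearings in
    radians measured from the world x-axis. Returns None when the
    rays are (near-)parallel, i.e. the collinear case in the text."""
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    denom = d1[0] * d2[1] - d1[1] * d2[0]   # cross product of ray directions
    if abs(denom) < 1e-6:                   # collinear/parallel configuration
        return None
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    s = (dx * d2[1] - dy * d2[0]) / denom   # distance along ray from camera 1
    return (p1[0] + s * d1[0], p1[1] + s * d1[1])
```

When `triangulate` returns None, the system would fall back to the size-ratio method (or a third camera) as the text describes.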

Page 11: A Fault-Tolerant Distributed Vision System Architecture for ...A Fault-Tolerant Distributed Vision System Architecture for Object Tracking in a Smart Room Deepak R. Karuppiah, Zhigang

Fig. 6. Motion detection & tracking in a panoramic camera

There are outliers in the track caused by false detections and mismatches, which can be rejected using the fault-tolerance mechanism described later in Sec. 7.1.

6.2 Pan-Tilt-Zoom Camera Nodes

The PTZ cameras are another type of sensor used in our system. A PTZ camera can function in two modes: 1) motion detection, and 2) target tracking. In mode 1, the camera remains still and functions in exactly the same way as a panoramic camera, except that now it has a narrow field of view with high resolution. Though the area covered is very limited due to the narrow field of view, it is better than a panoramic camera for face detection. In mode 2, the camera gets information about a moving target from the higher level and tries to pursue the target. In this mode, it can continuously receive information from the higher level about the target location, or it can take over tracking itself (see Fig. 8). Whenever a face is detected in its field of view, it can zoom in and snap a face shot of the moving person immediately. Face shots fall under the category of special features for this sensor node and are sent to user agents on demand.

7 Experimental Results

In this section, the implementation of the system as well as preliminary experimental results are described. The smart room consists of four vision sensors to monitor the activities of human subjects entering the room. Our experiments were conducted using different arrangements of sensors, which are shown in Figs. 11 & 13. Two of the sensors are panoramic cameras, represented as Pano-I and Pano-II respectively, while the other two are Sony Pan-Tilt-Zoom (PTZ) cameras, represented as PTZ-I and PTZ-II respectively. A single resource manager coordinates the cameras and reports their availability to the Track User Agent (Track-UA). The goal of this user agent is to track multiple humans in the smart environment and get face shots of the human subjects (see Fig. 9 and Fig. 10). The efficacy of the system can be evaluated based on two criteria, namely fault-tolerance and accuracy of the tracking. The former criterion evaluates the usefulness of the hierarchical design, while the latter evaluates the vision algorithms and the synchronization between multiple sensors.

Page 12: A Fault-Tolerant Distributed Vision System Architecture for ...A Fault-Tolerant Distributed Vision System Architecture for Object Tracking in a Smart Room Deepak R. Karuppiah, Zhigang

(a) Size-ratio method used in collinear regions of the track
(b) False matches in the center of the track could be rectified by a third camera

Fig. 7. A virtual sensor node comprising two panoramic cameras providing 3-D locations of objects in the scene

Page 13: A Fault-Tolerant Distributed Vision System Architecture for ...A Fault-Tolerant Distributed Vision System Architecture for Object Tracking in a Smart Room Deepak R. Karuppiah, Zhigang

Fig. 8. Illustration of using flow to track a moving object in a PTZ camera node

Fig. 9. Frames from PANO-I & PANO-II showing the two objects being tracked

Fig. 10. Face shots of the two subjects from PTZ-I and PTZ-II respectively

Page 14: A Fault-Tolerant Distributed Vision System Architecture for ...A Fault-Tolerant Distributed Vision System Architecture for Object Tracking in a Smart Room Deepak R. Karuppiah, Zhigang

7.1 Fault-Tolerance Evaluation

The first criterion was evaluated by generating faults at the sensor nodes and observing how the system reacts. As explained in Sec. 3, the resource manager is at the core of the fault-tolerant operation of the system. In our system, the resource manager maintains an availability list. This list is pushed to the Track-UA upon occurrence of certain events, like a sensor coming online or a sensor failure. The Track-UA uses the rule-based decision-making engine shown in Table 1 to take appropriate actions.

Table 1. The rule-based decision-making engine in Track-UA

Availability                              | Action
------------------------------------------|----------------------------------------------------------
At least one camera                       | Keep track of the heading of the human subjects
Panoramic-PTZ pair close to each other    | Use the bearing information from the panoramic camera to pan the PTZ towards the most dominant human subject
Two panoramic and one PTZ                 | Match objects across the panoramic cameras. Triangulate to find the 3-D location of each matched object. Use the PTZ to look at one of the objects
Two panoramic and two PTZ                 | Same as the previous state, except assign two of the objects to the two PTZ
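The rules in Table 1 could be encoded roughly as follows. The function signature and the action labels are mine, not the system's API; the rule ordering simply checks the richest configurations first.

```python
def choose_action(n_pano, n_ptz, ptz_near_pano=False):
    """Sketch of Table 1's rule-based decision engine in Track-UA:
    map the current sensor availability to an action label.
    (Labels and parameters are illustrative assumptions.)"""
    if n_pano >= 2 and n_ptz >= 2:
        return "triangulate; assign two objects to the two PTZ"
    if n_pano >= 2 and n_ptz >= 1:
        return "match across panoramics; triangulate; PTZ on one object"
    if n_pano >= 1 and n_ptz >= 1 and ptz_near_pano:
        return "pan PTZ using panoramic bearing"
    if n_pano + n_ptz >= 1:
        return "track heading only"
    return "no sensors available"
```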

The system reacting to the event of Pano-II failing is shown in Fig. 11. When both panoramic cameras are available, the 3-D location of the moving subject can be estimated by triangulation (Fig. 11(a)). In this case either PTZ camera (PTZ-I or PTZ-II) can be assigned to focus on the human subject. However, if Pano-II fails, we can only use Pano-I to estimate the bearing of the subject. If no other cameras are available for 3-D localization via triangulation, we can only use PTZ-I, which is closely placed with Pano-I, to obtain the face of the human subject.

Fig. 11. Pano-I and Pano-II are available in (a); Pano-II has failed in (b)


In Sec. 6.1, we showed two kinds of failures when only a pair of panoramic cameras is used for 3-D localization, namely poor triangulation under collinear conditions and stereo mismatches. Both of these errors can be detected and easily handled by a third camera, as shown in Fig. 12. Collinear conditions can be detected easily from the bearing angles. The triangulation accuracy under such conditions can be improved either by using a third camera or, when one is not available, by using the size-ratio method. A more difficult problem is the detection of false objects due to stereo mismatches. Again, by verifying each object candidate with a third camera, false ones can be rejected.
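The bearing-based triangulation and the collinearity check described above can be illustrated in 2-D as follows. The function name and the 5-degree collinearity margin are assumptions for illustration, not the system's actual parameters.

```python
import math

def triangulate(p1, theta1, p2, theta2, collinear_margin=math.radians(5)):
    """Intersect two bearing rays from cameras at p1 and p2 (world-frame
    angles in radians). Returns the (x, y) intersection, or None when the
    geometry is near-collinear and triangulation is unreliable."""
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    # Cross product of the ray directions equals sin(theta2 - theta1):
    # near zero means the cameras and target are nearly collinear.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < math.sin(collinear_margin):
        return None  # fall back to a third camera or the size-ratio method
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t1 = (dx * d2[1] - dy * d2[0]) / denom  # range along the first ray
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])
```

For example, cameras at (0, 0) and (4, 0) both sighting a target at (2, 2) yield bearings of 45 and 135 degrees, and the intersection recovers (2, 2); opposite bearings along the baseline return None.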

Fig. 12. An object that is collinear with Camera-1 and Camera-2 can still be localized by the (Camera-1, Camera-2) pair using the size-ratio method, or better by either the (Camera-1, Camera-3) pair or the (Camera-2, Camera-3) pair using the triangulation method. A mismatch could result in a false object, which can easily be discarded when verified by Camera-3
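The third-camera verification in Fig. 12 amounts to a bearing-consistency check: a candidate from a stereo match is accepted only if the third camera also observes an object near the bearing predicted from the candidate's position. The function name, tolerance, and data layout below are illustrative assumptions.

```python
import math

def verify_candidate(candidate, cam3_pos, cam3_bearings, tol=math.radians(3)):
    """Accept a 2-D candidate (x, y) if some bearing measured by the third
    camera agrees with the bearing predicted from the candidate position."""
    predicted = math.atan2(candidate[1] - cam3_pos[1],
                           candidate[0] - cam3_pos[0])
    def ang_diff(a, b):
        # Smallest absolute angular difference, handling wrap-around at pi.
        return abs(math.atan2(math.sin(a - b), math.cos(a - b)))
    return any(ang_diff(predicted, b) <= tol for b in cam3_bearings)
```

A false object produced by a stereo mismatch generally predicts a bearing that matches none of the third camera's detections, so it is rejected.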

In a multi-sensor network, information from many cameras can thus provide a better degree of fault-tolerance and allow for dynamic resource reconfiguration to perform a particular task. Results from a real run are shown in Fig. 13. Figure 13a shows two sensors triangulating at the moving object. But as the object moves further to the right, it is occluded from the view of one of the sensors. So in Fig. 13b, we observe that another sensor is brought in to continue tracking. The same process is repeated for the case of collinear sensor geometry in Figs. 13c & 13d. We have discussed fault-tolerance by sensor reconfiguration in more detail in [5].



Fig. 13. Frames from a tracking task demonstrating pair-wise sensor mode changes induced by the occurrence of faults, such as an object track being lost by one of the sensors (a-b), or collinear sensor geometry (c-d)


7.2 Synchronization Results

In this experiment, a person walked along a pre-determined path at a constant velocity, and two panoramic cameras, Pano-I and Pano-II, were used to track the motion (Fig. 14). Before the start of the experiment, the local clocks on all the sensor nodes were synchronized using NTP. In this experiment, Pano-II was set to process twice as fast as Pano-I. The result of target tracking under this condition is shown in Fig. 14a. We can see that the error in the track result is quite large, with a mean of around 60 cm, due to the lack of synchronization. After we employed the interpolation method discussed in Sec. 5, the mean localization error was reduced to within 15 cm (Fig. 14b).
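The interpolation step can be sketched as plain linear interpolation of the slower stream onto the faster stream's NTP-synchronized timestamps before fusion. This is a simplified stand-in for the method of Sec. 5, with names and data layout assumed for illustration.

```python
def interpolate_at(samples, t):
    """samples: list of (timestamp, value) pairs sorted by timestamp,
    with timestamps taken from the NTP-synchronized local clock.
    Returns the value linearly interpolated at time t, or None when t
    lies outside the sampled interval (no extrapolation)."""
    if not samples or t < samples[0][0] or t > samples[-1][0]:
        return None
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            if t1 == t0:
                return v0
            a = (t - t0) / (t1 - t0)  # fractional position within the gap
            return v0 + a * (v1 - v0)
    return samples[-1][1]
```

With both streams expressed on a common clock, a bearing from the slower camera can be interpolated to the exact instant of the faster camera's frame, which is what removes most of the 60 cm localization error observed above.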


Fig. 14. (a) Unsynchronized tracking and (b) synchronized tracking

8 Conclusion

A distributed sensor network architecture comprising three levels of hierarchy has been proposed in this paper. The hierarchy consists of sensor nodes, a resource manager, and user agents. The resource manager acts as a proxy between the sensor nodes and the user agents, allowing many user agents to simultaneously share sensor resources. The system was implemented using two types of vision sensors, namely panoramic cameras and pan-tilt-zoom cameras. The system was evaluated for its fault-tolerance performance and accuracy of tracking. A simple, cost-effective way of synchronizing data streams from heterogeneous sensors using NTP was discussed, and the experimental results showed the practical utility of this approach.

Given the general nature of the proposed architecture, it is possible to add different types of sensors, such as acoustic and pyroelectric sensors, and use them to perform a variety of tasks. The system will be extended to realize its full potential of having multiple user agents, each pursuing a specific goal in the smart environment, simultaneously using multiple resource managers in the multi-sensor framework. The system could be further extended to provide a human interface by building semi-autonomous user agents. A semi-autonomous user agent could interact with a human who can make decisions. While acting under guidance, the agent can learn and increase its confidence in handling certain situations. When it encounters similar situations in the future, it can act autonomously without guidance.

Acknowledgements

We would like to thank the other members of our research team - Gary Holness, Subramanya Uppala, S. Chandu Ravela, and Prof. Roderic Grupen - for their involvement in the development of this research. We would also like to thank Yuichi Kikuchi for his contribution to the development of the Pan-Tilt-Zoom sensor node.

References

[1] D. R. Karuppiah et al. Software mode changes for continuous motion tracking. In P. Robertson, H. Shorbe, and R. Laddaga, editors, Self-Adaptive Software, volume 1936 of Lecture Notes in Computer Science, Oxford, UK, April 17-19, 2000. Springer Verlag.

[2] T. Matsuyama et al. Dynamic memory: Architecture for real time integration of visual perception, camera action, and network communication. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 2, Hilton Head Island, SC, June 2000.

[3] P. Greguss. Panoramic imaging block for three-dimensional space. U.S. Patent 4,566,763, January 1986.

[4] I. Haritaoglu, D. Harwood, and L. S. Davis. W4: Real-time system for detection and tracking of people in 2.5D. In Proc. of the 5th European Conf. on Computer Vision, Freiburg, Germany, June 1998.

[5] G. Holness, D. Karuppiah, S. Uppala, R. Grupen, and S. C. Ravela. A service paradigm for reconfigurable agents. In Proc. of the 2nd Workshop on Infrastructure for Agents, MAS, and Scalable MAS, Montreal, Canada, May 2001. ACM. To appear.

[6] Z. T. Kalbarczyk, S. Bagchi, K. Whisnant, and R. K. Iyer. Chameleon: A software infrastructure for adaptive fault tolerance. IEEE Transactions on Parallel and Distributed Systems, 10(6):1-20, June 1999.

[7] M. Kokar, K. Baclawski, and Y. A. Eracar. Control theory based foundations of self controlling software. IEEE Intelligent Systems, 14(3):37-45, May 1999.

[8] R. Ladagga. Creating robust software through self-adaptation. IEEE Intelligent Systems, 14(3):26-29, May 1999.

[9] K. Marzullo and S. Owicki. Maintaining the time in a distributed system. ACM Operating Systems Review, 19(3):44-54, July 1985.

[10] D. L. Mills. Internet time synchronization: The network time protocol. IEEE Transactions on Communications, 39(10):1482-1493, October 1991.

[11] D. L. Mills. Improved algorithms for synchronizing computer network clocks. IEEE/ACM Transactions on Networks, 3(3):245-254, June 1995.

[12] A. Nakazawa, H. Kato, and S. Inokuchi. Human tracking using distributed vision systems. In 14th International Conference on Pattern Recognition, pages 593-596, Brisbane, Australia, 1998.

[13] Kim C. Ng, H. Ishiguro, Mohan M. Trivedi, and T. Sogo. Monitoring dynamically changing environments by ubiquitous vision system. In IEEE Workshop on Visual Surveillance, Fort Collins, Colorado, June 1999.

[14] A. Pentland. Looking at people: Sensing for ubiquitous and wearable computing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(1):107-119, January 2000.

[15] T. Sogo, H. Ishiguro, and Mohan M. Trivedi. Panoramic Vision: Sensors, Theory and Applications, chapter N-Ocular Stereo for Real-time Human Tracking. Springer Verlag, 2000.

[16] Mohan M. Trivedi, K. Huang, and I. Mikic. Intelligent environments and active camera networks. IEEE Transactions on Systems, Man and Cybernetics, October 2000.

[17] Z. Zhu, K. Deepak Rajasekar, E. Riseman, and A. Hanson. Panoramic virtual stereo vision of cooperative mobile robots for localizing 3D moving objects. In Proceedings of IEEE Workshop on Omnidirectional Vision - OMNIVIS'00, pages 29-36, Hilton Head Island, SC, June 2000.