
Measurement 32 (2002) 61–71
www.elsevier.com/locate/measurement

Multi-camera image measurement and correspondence

James Black, Tim Ellis*

Department of Electrical, Electronic and Information Engineering, City University, London EC1V 0HB, UK

Abstract

This paper describes part of a project to develop a multi-camera image tracking system. The system will be developed in the context of a surveillance and monitoring task, principally targeted at tracking people through indoor and outdoor environments. Moving objects were detected for each camera viewpoint using background subtraction. Correspondence between viewpoints is then established by epipole line analysis. Measurements were made in 3D on each corresponded object using a least squares estimate. The error covariance of each measurement was computed using a sensory uncertainty field. This information will be used to interpret the behaviour of the observed object and learn the relationships between each camera viewpoint and the scene under observation. © 2002 Elsevier Science Ltd. All rights reserved.

Keywords: Multi-camera image tracking system; Video surveillance

* Corresponding author. E-mail address: [email protected] (T. Ellis).

1. Introduction

The work reported in this paper addresses some of the problems of reliably detecting and tracking multiple objects moving in a complex and cluttered environment. Occlusion represents a major problem for tracking objects in such scenes, and the ambiguity which may arise from monocular views can necessitate complex reasoning mechanisms, especially in circumstances where it is difficult to reliably identify the targets [1]. We use multiple viewpoints to improve estimation of 3D measurements and minimise occlusions, corresponding image features across the viewpoints by using geometric calibration of the cameras in world co-ordinates.

One approach to the detection and tracking problem is to fit explicit object models of shape, such as rigid wire-frame CAD models [2,3] or flexible active shape models [4], to the objects in motion [14,15]. A simpler and more efficient alternative, which we have adopted in previous work [5,6], models the scene, capturing variations in illumination and scene geometry to support object detection through background subtraction. Object identification is then performed using size, location and context, which is sufficient for tracking in 3D using the ground plane constraint (GPC) [7]. In previous work, we have exploited object colour [16] and kinematic constraints to maintain robust identification and tracking over long image sequences. One advantage of using region-based object attributes such as colour is that they are more reliably detected when the imaged scene contains a large range of depth, and distant targets cover only a small number of pixels [17,18].

Using geometric constraints on the scene (e.g. the GPC), it is possible to infer the 3D location of targets from monocular views [3,7,13], allowing the tracker to exploit real (rather than image plane) motion constraints. This relies on both accurate knowledge of the camera location with respect to the scene (the extrinsic camera parameters) and the spatial geometry of important features in the scene. For fixed cameras, which are typical in the majority of video monitoring and surveillance installations, camera calibration is a straightforward and well used method.

Integration of multiple viewpoints has been widely used in the photogrammetric field to extract 3D geometric information, particularly for measuring elevation from aerial data and constructing 'as-built' models of engineering objects (typically rigid and non-articulating) [8,9]. These measurements are principally based on information from stationary objects. Recent interest [10–12] has focussed on extracting 3D models of non-stationary objects, such as people, which are non-rigid and articulating, using multiple camera views. Moezzi et al. [11] have created 3D graphical models for virtual reality and immersive video using multiple viewpoint video sequences. The 3D objects are computed using a method of projecting the 2D object to a volumetric representation, intersecting projections from different viewpoints. Kanade [10] has used multiple cameras to capture data for encoding an actor performing. Such models can then be used to generate movement sequences from arbitrary viewpoints. Cai and Aggarwal [12] employed multiple camera views for surveillance purposes, using a set of feature points extracted from the medial axis of an object extracted using background subtraction. This is combined with measures of position and velocity of a bounding box in a multivariate Gaussian model to identify most likely matches for tracking objects.

In complex and cluttered environments with even moderate numbers of moving objects (e.g. 5–20) the problem of tracking individual objects is significantly complicated by occlusions in the scene, where the object will partially or totally disappear from the visible field of view for both short or extended periods of time. Static occlusion results from objects moving behind (with respect to the camera) fixed elements in the scene (e.g. walls, tables), whilst dynamic occlusion occurs as a result of moving objects in the scene occluding each other, where the targets may merge or separate (e.g. a group of people walking together). Kalman filtering provides a mechanism for spatio-temporal reasoning to minimise the ambiguities resulting from occlusion by utilising track prediction; we have also used colour identification based on adaptive colour templates [16]. We are developing this occlusion reasoning in a probabilistic framework, combining evidence to maintain tracking and target identification, propagating measures of uncertainty through the reasoning process.

2. Detecting moving objects

Identifying moving objects in each camera view was performed in two stages. The first stage involved generating a reference image which gives a representation of the viewpoint containing no moving objects. The second stage was to perform an image differencing operation to detect those regions in the image where a change had occurred, and then isolating all those regions which were likely candidates for moving objects. Background subtraction is a very sensitive algorithm for generating motion cues, though it suffers from a number of well known difficulties. These are mainly related to maintaining an accurate reference image, in the face of: unstable surface reflection properties of materials in the scene (e.g. specularities); shadows; global illumination changes; insufficient contrast between object and background.

2.1. Reference image generation

Reference images were used to isolate stationary objects within the scene being observed. They are a powerful tool for detecting moving objects. Other researchers [16,19,21] have used this method to generate areas of interest and isolate moving objects in their tracking systems. The reference image was initialised to the first frame in the image sequence.

The reference image is updated by using the current image frame and the previous reference image. Koller [19] used a similar method for maintaining a reference image for analysis of traffic scenes. The detail of this method is explained in Ref. [23]. The reference image is updated by performing the following computations:

R_t(x) = R_{t-1}(x) + (a_1 (1 - M_{t-1}(x)) + a_2 M_{t-1}(x)) D_t(x)    (1)

D_t(x) = F_t(x) - R_{t-1}(x)    (2)

T1(x) = m R_{t-1}(x) + T_0    (3)

M_t(x) = { 1 if |D_t(x)| > T1(x); 0 if |D_t(x)| \le T1(x) }    (4)

where R_{t-1}(x) is the reference image for image frame t-1, F_t(x) corresponds to the current image frame at time t, and T1(x) is a threshold image which is used to identify foreground pixels.

For the image sequences in this paper the following values were used for the parameters in the reference image update process: a_1 = 0.1, a_2 = 0.01, T_0 = 2.0, and m varied from 0.234 to 0.293. The first two parameters control the influence of the current frame during the update of the reference image. A larger weighting value is given to pixels that have been classified as part of the background. The remaining two parameters are used to construct a threshold image based on the reference image.
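To make the update concrete, Eqs. (1)–(4) can be written directly as array operations. The sketch below is illustrative rather than the authors' implementation; it assumes greyscale frames held as floating-point NumPy arrays, and m is fixed to a representative value from the quoted range (0.234–0.293).

    import numpy as np

    def update_reference(R_prev, M_prev, F_t, a1=0.1, a2=0.01, m=0.26, T0=2.0):
        """One step of the reference image update of Eqs. (1)-(4).

        R_prev : previous reference image R_{t-1}
        M_prev : previous foreground mask M_{t-1} (1 = foreground, 0 = background)
        F_t    : current image frame
        """
        D_t = F_t - R_prev                        # Eq. (2): frame difference
        T1 = m * R_prev + T0                      # Eq. (3): per-pixel threshold
        M_t = (np.abs(D_t) > T1).astype(float)    # Eq. (4): foreground mask
        # Eq. (1): blend the difference into the reference, weighting pixels
        # previously classified as background (a1) more than foreground (a2).
        R_t = R_prev + (a1 * (1.0 - M_prev) + a2 * M_prev) * D_t
        return R_t, M_t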

2.2. Image differencing process

Performing an image differencing process between the current frame and the reference image results in an image which (potentially) contains moving objects of interest. A binary image containing the moving objects is created by performing a thresholding operation. The threshold image is created by applying a global weighting value and offset to the reference image. This operation ensures that the threshold value is set appropriately based upon the intensity of the pixel in the reference image. Hence, regions of low intensity would have a smaller threshold value compared to regions of high intensity. The image differencing process is the same as Eq. (4), except D_t(x) and M_t(x) are based on F_t(x) and R_t(x); further details can be found in Ref. [23].

A morphological filter was used to minimise the break-up of the binary blobs, which occurs when the contrast between object and background is low. The morphological operation consisted of two closing operations followed by an opening, using a 3 x 3 structure element with all the entries set to one.
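A minimal sketch of this cleanup step is given below, assuming the binary motion mask is held as a NumPy array; SciPy's binary morphology routines stand in for whatever implementation the original system used.

    import numpy as np
    from scipy.ndimage import binary_closing, binary_opening

    def clean_mask(M_t):
        """Merge fragmented blobs in the binary motion mask: two closing
        operations followed by one opening, using a 3x3 structuring element
        with all entries set to one, as described in Section 2.2."""
        se = np.ones((3, 3), dtype=bool)
        mask = binary_closing(M_t.astype(bool), structure=se, iterations=2)
        return binary_opening(mask, structure=se)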

3. Correspondence and 3D measurements

The next stage in the processing is to identify corresponding blobs generated by the same object in the different viewpoints. In addition, we extract 3D measurements of object location and height as part of the correspondence operation. The 3D measurements are used in a variety of ways: the height measurements could be used to interpret behaviour in the case of a human object; to identify a region of overlap between two or more cameras; to identify occlusion planes in the scene.

3.1. Epipolar plane constraint

The epipole plane constraint was used to define the correspondence of objects over multiple viewpoints. Camera calibration played an important role in this process, since it was used to construct the epipole lines which were used for correspondence analysis. The implementation of Tsai's method [20] developed by Wilson was integrated into the current system to perform camera calibration.

It is possible to map any image point to a 3D line in the world coordinate space. This allows two feature points from different viewpoints to be put into 3D correspondence. Thus, in Fig. 1 the point m on image plane 1 is constrained to lie on the epipole line ep11 in the image plane of camera 2.

3.2. Object correspondence

To provide an additional constraint for the epipole line analysis the bounding box of each moving object was computed to facilitate the matching process. The object correspondence algorithm had the following structure (a code sketch follows the list):

1. For each moving object detected
   1.1 Derive the bounding box of the object.
   1.2 Generate epipole lines through the centroid of the moving object for the other viewpoints.
2. For each object pair Ai, Bj (i and j are the camera viewpoints)
   2.1 If epipole line j of object Ai intersects with the bounding box line segment GP–HP of Bj, and if epipole line i of object Bj intersects with the bounding box line segment GP–HP of Ai, then update the correspondence information.
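The mutual test in step 2.1 can be sketched as follows, under stated assumptions: each detected object carries its centroid and the image points GP and HP (see Fig. 2), and epipolar_segment() is a hypothetical helper that projects a point through the calibration of one camera into another camera's image, returning the epipole line clipped to the 0–2 m height range used later in this section. This is an illustration, not the authors' code.

    import numpy as np

    def segments_intersect(p1, p2, q1, q2):
        """True if the 2D segments p1-p2 and q1-q2 cross (orientation test;
        degenerate collinear overlaps are ignored for brevity)."""
        p1, p2, q1, q2 = (np.asarray(v, dtype=float) for v in (p1, p2, q1, q2))
        def orient(a, b, c):
            # sign of the z-component of (b - a) x (c - a)
            return np.sign((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))
        return (orient(p1, p2, q1) != orient(p1, p2, q2) and
                orient(q1, q2, p1) != orient(q1, q2, p2))

    def mutually_corresponded(obj_a, obj_b, epipolar_segment):
        """Step 2.1: objects A (camera i) and B (camera j) are corresponded only
        if the epipole line of each object's centroid intersects the GP-HP
        segment of the other.  Objects are dicts holding 'camera', 'centroid',
        'GP' and 'HP' image points; epipolar_segment(point, src, dst) is a
        hypothetical calibration-based helper returning two endpoints."""
        seg_in_j = epipolar_segment(obj_a['centroid'], obj_a['camera'], obj_b['camera'])
        seg_in_i = epipolar_segment(obj_b['centroid'], obj_b['camera'], obj_a['camera'])
        return (segments_intersect(seg_in_j[0], seg_in_j[1], obj_b['GP'], obj_b['HP']) and
                segments_intersect(seg_in_i[0], seg_in_i[1], obj_a['GP'], obj_a['HP']))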


Fig. 1. Geometric view of the epipole plane. Es1 and Es2 are the perspective centres of image planes 1 and 2, respectively.

By using epipole line analysis alone there was the potential for many ambiguous matches in a dynamic environment with many moving objects. Two additional conditions were added to constrain the number of matches. The first constraint was that the constructed epipole lines were limited to have a height in the range 0–2 m above the ground plane. This constraint was applied to prevent ambiguous matches occurring which were unlikely, i.e. where the centroid of the object was more than 2 m above the ground plane. The second was that, for two objects to be considered matches, the bounding box of each object should intersect with the epipole line of the respective object.

3.3. Height estimation

When tracking people in cluttered scenes, the top of the head is much more likely to be seen than the feet, because of static occlusions (e.g. chairs, tables, etc.). In order to calculate the height of each object it was necessary to reconstruct 3D lines projected through the mid-point of the top of the object's bounding box, the point HP in Fig. 2.

Fig. 2. Bounding box of object. BB0 . . . BB3 define the bounding box of the object. C is the centroid of the shaded object. The line segment (C–GP) represents an image vector which is perpendicular to the ground plane. HP and GP refer to the top and the bottom of the object, respectively.

Given a set of N 3D lines

r_i = a_i + \lambda_i b_i

a point p = (x, y, z)^T must be evaluated which minimises the error measure

\xi^2 = \sum_{i=1}^{N} d_i^2

where d_i is the perpendicular distance from the point p to the line r_i; assuming that the direction vector b_i is a unit vector, then we have

d_i^2 = |p - a_i|^2 - ((p - a_i) \cdot b_i)^2 .

Figure 3 provides an explanation of the error measure from a geometric viewpoint.

Evaluating the partial derivatives of the summation of all d_i^2 with respect to x, y and z results in the equation for computing the least squares estimate of p:


Fig. 3. Geometric view of the minimum discrepancy.

\xi^2 = \sum_{i=1}^{N} d_i^2 = \sum_{i=1}^{N} \{ |p - a_i|^2 - ((p - a_i) \cdot b_i)^2 \}

\xi^2 = \sum_{i=1}^{N} \{ (x - a_{ix})^2 + (y - a_{iy})^2 + (z - a_{iz})^2 - ((x - a_{ix}) b_{ix} + (y - a_{iy}) b_{iy} + (z - a_{iz}) b_{iz})^2 \}

\partial \xi^2 / \partial x = \sum_{i=1}^{N} \{ 2(x - a_{ix}) - 2((p - a_i) \cdot b_i) b_{ix} \}

Similarly,

\partial \xi^2 / \partial y = \sum_{i=1}^{N} \{ 2(y - a_{iy}) - 2((p - a_i) \cdot b_i) b_{iy} \}

\partial \xi^2 / \partial z = \sum_{i=1}^{N} \{ 2(z - a_{iz}) - 2((p - a_i) \cdot b_i) b_{iz} \}

\partial \xi^2 / \partial x + \partial \xi^2 / \partial y + \partial \xi^2 / \partial z = 0

Using matrix notation an equation can be derived to minimise the error function for all N lines:

\begin{bmatrix}
\sum_{i=1}^{N} (1 - b_{ix}^2) & \sum_{i=1}^{N} (-b_{ix} b_{iy}) & \sum_{i=1}^{N} (-b_{ix} b_{iz}) \\
\sum_{i=1}^{N} (-b_{ix} b_{iy}) & \sum_{i=1}^{N} (1 - b_{iy}^2) & \sum_{i=1}^{N} (-b_{iy} b_{iz}) \\
\sum_{i=1}^{N} (-b_{ix} b_{iz}) & \sum_{i=1}^{N} (-b_{iy} b_{iz}) & \sum_{i=1}^{N} (1 - b_{iz}^2)
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix}
\sum_{i=1}^{N} (a_{ix} - b_{ix} (a_i \cdot b_i)) \\
\sum_{i=1}^{N} (a_{iy} - b_{iy} (a_i \cdot b_i)) \\
\sum_{i=1}^{N} (a_{iz} - b_{iz} (a_i \cdot b_i))
\end{bmatrix}

K p = C \Rightarrow p = K^{-1} C

(with K the 3 x 3 coefficient matrix and C the right-hand side vector above). The point p can now be evaluated by solving the sum of the partial derivatives for the N 3D lines. A 3D line intersection algorithm was used to find the optimal height point in the least squares sense.
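Since the normal equations are linear in p, the least squares point can be computed directly. A small NumPy sketch (an illustration, not the authors' implementation) assuming the N lines are given by points a_i and unit direction vectors b_i:

    import numpy as np

    def intersect_lines_3d(a, b):
        """Least squares intersection point of N 3D lines r_i = a_i + lambda_i b_i.

        a : (N, 3) array of points on the lines
        b : (N, 3) array of direction vectors (normalised below)
        Builds K and C as in the matrix equation above and solves K p = C.
        """
        a = np.asarray(a, dtype=float)
        b = np.asarray(b, dtype=float)
        b = b / np.linalg.norm(b, axis=1, keepdims=True)   # enforce unit directions
        K = np.zeros((3, 3))
        C = np.zeros(3)
        for ai, bi in zip(a, b):
            P = np.eye(3) - np.outer(bi, bi)   # projector orthogonal to the line
            K += P
            C += P @ ai
        return np.linalg.solve(K, C)           # p = K^{-1} C

Each term I - b_i b_i^T reproduces the row pattern of the matrix above (diagonal entries 1 - b_ix^2, off-diagonal entries -b_ix b_iy, etc.), and (I - b_i b_i^T) a_i reproduces the right-hand side entries a_ix - b_ix (a_i . b_i).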

4. Sensory uncertainty field

The least squares estimate (LSE) of the top of each moving object was evaluated using the method discussed in the previous section. Using the Sensory Uncertainty Field (SUF) it was possible to compute the uncertainty of the 3D measurement. In Ref. [22], Olsen used the SUF to compute the uncertainty of each ground plane point within the visible field of view of a group of cameras. For the purposes of this paper this method was extended to give the uncertainty of a point lying on a plane Z = Zh, where Zh was the LSE for the top of the moving object (Z = 0 is the ground plane).

Computing the uncertainty was necessary because it gives a degree of confidence of each 3D measurement. This made it possible to quantify the level of accuracy of measurements in certain regions of the scene, and the error bounds on the object's location could be constrained appropriately.

4.1. Covariance propagation

To derive the uncertainty of the 3D measurement it was necessary to propagate the covariance from the image space to the object space. This is achieved by formulating two functions which define how an image point (x, y) is mapped to a 3D object space point (X, Y, Zh), for a given height Zh. Zh is the estimated height of the object computed by the method described in the previous section. Taking the partial derivatives of these two functions we can then derive the Jacobian matrix. The uncertainty of the 3D measurement for the given camera viewpoint is

S = J L J^T

where L is the estimate of the image covariance of the given camera viewpoint. The full derivation of the coefficients of the matrix can be found in Ref. [22].

4.2. Covariance fusion

The image space covariance is propagated for each camera which was used to make the 3D measurement of the moving object. This results in a set of object space covariance matrices.

These covariance matrices must now be combined, or fused, to arrive at an optimal estimate of the uncertainty of the 3D measurement point. Two methods were explored: covariance accumulation and covariance intersection. The equations for covariance accumulation are

C = (S_1^{-1} + S_2^{-1} + \cdots + S_N^{-1})^{-1}

where S_i is the result of propagating the image covariance to object space for each camera viewpoint which was used to make the 3D measurement, and C is the single distribution of uncertainty as a result of fusing the covariance of each camera viewpoint.

The equations for the covariance intersection are

C = (w_1 S_1^{-1} + w_2 S_2^{-1} + \cdots + w_N S_N^{-1})^{-1}

where

w_i = w'_i / \sum_{i=1}^{N} w'_i   and   w'_i = 1 / trace(S_i)

Covariance intersection [24] has been applied successfully to decentralised estimation applications, where sensors may partially observe the state of a tracked object. The covariance intersection equations are similar to the covariance accumulation equations except a weighting term is introduced. Non-uniform weighting is applied, as described above, so that preference is given to those estimates which have a smaller trace value. The difference between the two approaches can be seen in the plot of two covariance ellipses fused by each method (Fig. 4). It can be observed that fusion by accumulation gives a more optimistic result than fusion by intersection.
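As an illustration of the two fusion rules (a sketch following the formulas above, not the authors' code), with each S_i obtained from the propagation step S_i = J_i L_i J_i^T:

    import numpy as np

    def propagate(J, L):
        """Propagate an image-space covariance L to object space: S = J L J^T."""
        return J @ L @ J.T

    def fuse_accumulation(S_list):
        """Covariance accumulation: C = (S_1^-1 + ... + S_N^-1)^-1."""
        return np.linalg.inv(sum(np.linalg.inv(S) for S in S_list))

    def fuse_intersection(S_list):
        """Covariance intersection with the trace-based weights of the text:
        w_i' = 1/trace(S_i), w_i = w_i' / sum_j w_j'."""
        w = np.array([1.0 / np.trace(S) for S in S_list])
        w = w / w.sum()
        return np.linalg.inv(sum(wi * np.linalg.inv(S) for wi, S in zip(w, S_list)))

For N identical inputs, accumulation returns S/N whereas intersection returns S unchanged, which is the "more optimistic" behaviour of accumulation noted above.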

5. Experimental methods and results

The system has been evaluated on two image sequences using a configuration of four cameras. The first was a 101 frame sequence of a toy car moving through a laboratory. The second was a short (20 frame) sequence of a person moving through an office environment. The inter-frame sampling time was approximately 2 s for both sequences.

Fig. 4. (a) Covariance fusion by accumulation. (b) Covariance fusion by intersection.


Fig. 5. Frames 7–11 of the person image sequence.

Both of the sequences contained instances of the object entering and leaving the field of view of each camera viewpoint. Fig. 5 shows some sample images taken from the second sequence.

Each camera was calibrated using Tsai's [20] method for coplanar calibration. Landmark features were added to each of the environments and then surveyed. Manual measurements of the image coordinates were then matched against the survey measurements. For the first image sequence the mean square object space error was 33.78, 81.62, 83.64, and 185.83 mm^2 for camera viewpoints one to four, respectively. For the second image sequence the mean square object space error was 33.81, 17.73, 6.2, and 7.58 mm^2 for camera viewpoints one to four, respectively. What follows is a summary of the sequence of processing used on both image sequences:

1. Update the reference image of each camera viewpoint.
2. Perform image differencing to isolate moving objects.
3. For each detected object
   3.1 Compute the bounding box information.
   3.2 Generate the epipole lines for the other camera viewpoints.
4. Use epipole line analysis to correspond objects, as described in Section 3.2.
5. For each set of objects corresponded
   5.1 Compute the uncertainty of the object location using the SUF.
   5.2 Detect the occlusion plane or ground plane region based on whether the point where the object touches the ground plane is visible.

Each time two objects are matched the correspondence information is updated. At the end of the correspondence process each set of matched objects over two or more camera viewpoints is available to infer 3D information.

The object detection process reliably segmented moving objects under varying lighting conditions, but occasionally contained shadows. Frames 13 and 14 in Fig. 6b show examples of these cases, where the lower part of the segmented object is not well defined due to a trailing shadow. The initial results from the object detection process did not reliably detect the lower body of the moving object because it had a similar intensity to the background. This resulted in the moving object being split and separated during the image labelling. This problem was overcome by using a morphological filter to merge the upper and lower body of the object.

Figure 7 gives a graphical depiction of the epipole lines generated for each camera viewpoint. The segments below each bounding box indicate the presence of a ground plane region, or occlusion plane region. This was achieved by backprojecting the ground plane location of the object onto the image plane. If the backprojected point was outside the object's bounding box then an occlusion plane region could be identified.

Figure 9a shows a measure of the measurement uncertainty (represented by the trace of the covariance matrix) for all the measurements shown in Fig. 8. It can be observed in Fig. 9a that there were several measurements with uncertainties which were larger by several orders of magnitude compared to the other measurements. These cases were related to the second and fourth camera viewpoints.


Fig. 6. Sample results of the moving object detection process. (a) Toy car image sequence accumulated over all image frames. (b) Person image sequence accumulated over frames 11–15, viewed from left to right.

Fig. 7. The bounding boxes for each moving object along with their epipole lines.

The toy car was observed at a large distance from both cameras, compared to the other observations, which accounts for the significant increase of the uncertainty of the measurements.

The uncertainty is dependent on the number of cameras used to make the 3D measurement. The mean trace (covariance accumulation) for sequence one was 2.49 x 10^4, 419.18, and 410.62 for two, three, and four cameras being used to make the 3D measurements, respectively. In image sequence two the mean trace (covariance accumulation) was 38.08, 22.81, and 9.67 for two, three, and four cameras being used to make the 3D measurement, respectively. For both image sequences the uncertainty exhibited a downwards trend with an increase in the number of cameras used to make the 3D measurement, as expected.


Fig. 8. (a) LSE plot for the centroid and height of the toy car. (b) Ground plane location plot of the toy car.

Fig. 9. (a) Uncertainty of the 3D measurements for the toy car sequence. (b) Same plot as (a), except the outliers are removed.

The height measurements of the person sequence were consistent with the events in the scene. The mean height of the person was found to be 1.76 m. This excluded the outlier measurements in frames 1, 8, 9, 10, and 11 where the person was bending over and crouched. The actual height of the person was approximately 1.8 m (including hat and shoes). This small discrepancy can be attributed to the errors of the detection of the top of the object and the calibration information. However, the accuracy of the measurements is sufficient to interpret object behaviour. In Fig. 10a the height can be seen to fall from frames 7 to 10, and rise again at frame 11.


Fig. 10. (a) LSE plot for the centroid and height of the person sequence. (b) Uncertainty plot of the 3D measurements of the person sequence.

This captures the event of the person bending over, crouching and standing up again.

It can also be observed that the trace of the uncertainty for covariance accumulation is generally smaller than that of covariance intersection. This indicates that covariance accumulation generally gives more optimistic estimates of uncertainty than covariance intersection. Hence, in situations where the covariance is too optimistic this could result in divergence when tracking in a Kalman filter framework.

Covariance intersection should be used in a decentralised Kalman filter tracking framework where sensors are used for estimation of different properties of the object state, as is suggested in Ref. [24]. In addition, the non-uniform weighting used in covariance intersection makes practical sense, since preference is given to covariance matrices that have a higher degree of confidence.

6. Conclusion and future work

The system that has been implemented clearly demonstrates it is possible to extract reliable measurements of height and ground plane location by using multiple cameras with overlapping viewpoints. The calibration algorithm was successfully used to calibrate the network of four cameras overlooking a common scene.

The image correspondence approach provided a fast and efficient mechanism to match objects across multiple viewpoints. The person image sequence had several instances of the object being partially occluded and in full visibility within the field of view of each camera, but this did not affect the performance of the matching or height estimation. The value of detecting occlusion is that it facilitates the learning of occlusion plane regions for each camera viewpoint that provides information on the scene. An image sequence where objects move behind and in front of occlusion planes could be used to constrain the occlusion plane region position in 3D based on the location and uncertainty of each 3D measurement.

The system will be augmented to cope with several objects moving through the scene. There are several anticipated problems which will need to be overcome: the increased probability of object correspondence errors, occlusion between moving objects, tracking objects between successive frames and coping with the changes in shape of each object. The current method can be used to identify a list of candidate objects that may potentially be in correspondence. The consistency of the top of the object over the corresponded viewpoints could be used to filter ambiguous matches.

The SUF was used to measure the degree of uncertainty of the location of each object. Computing the uncertainty of each 3D measurement was important because it gives the system a quantifiable value that indicates the degree of confidence in each measurement. The uncertainty is reduced with an increase in the number of cameras used to make each 3D measurement. The SUF will be extended to include the uncertainty of the height of each object; this will provide a mechanism to assess the accuracy of the height measurement before inferring the behaviour of the object.

References

[1] H. Buxton, S. Gong, Advanced visual surveillance using bayesian networks, in: Proceedings of a Workshop on Context-based Vision, Cambridge, MA, IEEE Computer Society Press, 1995, pp. 111–123.
[2] M. Frank, H. Haag, H. Kollnig, H.-H. Nagel, Tracking of occluded vehicles in traffic scenes, in: Proceedings ECCV'96, Cambridge, UK, LNCS Series, Vol. 1065, 1996, pp. 485–494.
[3] D. Koller, K. Daniilidis, H.-H. Nagel, Model-based tracking in monocular image sequences of road traffic scenes, Int. J. Comput. Vis. 10 (3) (1993) 257–281.
[4] A. Baumberg, D. Hogg, Learning flexible models from image sequences, in: Proceedings ECCV'94, Stockholm, LNCS Series, Vol. 800, 1994, pp. 299–308.
[5] T.J. Ellis, P.L. Rosin, P. Golton, Model-based vision for automatic alarm interpretation, IEEE Aerospace Electron. Syst. Mag. 6 (3) (1991) 14–20.
[6] P.L. Rosin, Thresholding for change detection, in: Proceedings BMVC, 1997, pp. 212–221.
[7] G.D. Sullivan, Model-based vision for traffic scenes using the ground-plane constraint, in: D. Terzopoulos, C. Brown (Eds.), Real-time Computer Vision, Cambridge University Press, 1994.
[8] T.A. Clarke, T.J. Ellis, S. Robson, High accuracy 3D measurement using multiple camera views, in: IEE Colloquium Digest No. 1994/054, 1994.
[9] T.A. Clarke, M.A.R. Cooper, J. Chen, S. Robson, Automated 3-D measurement using multiple CCD camera views, Photogram. Rec. XV (86) (1994) 315–322.
[10] T. Kanade, P.J. Narayanan, P.W. Rander, Virtualized reality, in: Proceedings of the 15th International Conference on Display Technology, Japan, 1995.
[11] S. Moezzi, A. Katkere, D.Y. Kuramura, R. Jain, Reality modelling and visualisation from multiple video sequences, IEEE Comput. Graph. Applic. 16 (6) (1996) 58–63.
[12] Q. Cai, J.K. Aggarwal, Tracking human motion using multiple cameras, in: Proceedings ICPR96, Vienna, 1996, pp. 68–72.
[13] R. Fraile, S.J. Maybank, Vehicle trajectory approximation and classification, in: Proceedings BMVC'98, Southampton, 1998, pp. 832–840.
[14] N. Johnson, D.C. Hogg, Learning the distribution of object trajectories for event recognition, in: Proceedings BMVC95, Birmingham, 1995, pp. 583–592.
[15] R.J. Morris, D.C. Hogg, Statistical models of object interaction, in: Proceedings of an IEEE Workshop on Visual Surveillance, Bombay, 1998.
[16] S. Brock-Gunn, T.J. Ellis, Using colour templates for target identification and tracking, in: D.C. Hogg, R. Boyle (Eds.), Proc. BMVC92, 1992, pp. 207–216.
[17] T.J. Ellis, P.L. Rosin, P. Moukas, P. Golton, A knowledge-based approach to automatic alarm interpretation using computer vision on image sequences, in: International Carnahan Conference on Security Technology, Zurich, 1989.
[18] P.L. Rosin, T. Ellis, Detecting and classifying intruders in image sequences, in: Proceedings BMVC91, Glasgow, 1991, pp. 293–300.
[19] D. Koller, J. Weber, J. Malik, Robust multiple car tracking with occlusion reasoning, in: Proceedings ECCV'94, 1994, pp. 186–196.
[20] R.Y. Tsai, An efficient and accurate camera calibration technique for 3D machine vision, in: Proceedings CVPR '86, IEEE, 1986, pp. 323–344.
[21] M. Teal, T.J. Ellis, Spatial-temporal reasoning based on object motion, in: British Machine Vision Conference, 1996, pp. 465–474.
[22] B. Olsen, Robot navigation using a sensor network, Masters Thesis, Aalborg University Institute of Electronic Systems, Laboratory of Image Analysis, 1998.
[23] J. Black, T.J. Ellis, Multi camera image tracking, Technical Report No. MVG99/1, Dept. EEIE, City University, London, 1999.
[24] S. Julier, J.K. Uhlmann, A non-divergent estimation algorithm in the presence of unknown correlations, in: American Control Conference, 1997, http://www.ait.nrl.navy.mil/people/uhlmann/CovInt.html.