
Vehicle Guidance System for Scaled Vehicles through Simple Road Networks Based on External Camera Information

Karsten Behrendt

Abstract—This paper develops a framework for analyzing scaled road maps and computer-based remote control of vehicles solely based on external cameras. There are three main aspects required for extracting a road network out of multiple camera images and calculating trajectories. The first is the positioning of a camera network covering the complete environment, which allows a scene analysis based solely on computer vision. The second is a sequence of steps to detect, extract, analyze and store a road map from captured camera images. The third contribution is a method to track dynamic objects in real time and place those objects within the road map, to be used for trajectory planning. Position and orientation of the remote-controlled vehicle need to be continuously supervised so that deviations from planned trajectories can be detected and corrected in real time. Communication between the individual steps follows a strict interface to preserve the possibility of exchanging methods within the framework. The framework is implemented and tested for a 1:24 scaled environment. The intended use of remote controlling scaled vehicles is to provide a solution for testing, monitoring and improving automated vehicles in a safe way. The specific motivation for this project is to be able to test external vehicle guidance based solely on digital camera images.

Index Terms—Computer Vision, Road Analysis, Object Tracking, Remote Control

I. INTRODUCTION

THIS seminar paper presents an approach to guide remote-controlled cars on road maps based on external digital cameras. After an initialization and calibration of all cameras, the environment is analyzed and, based on that, scaled optimal trajectories are calculated for the road map.

Currently, multiple approaches for automated driving are being researched. Automated vehicles are equipped with a vast amount of sensors, such as laser range scanners, cameras, infrared sensors, inertial measurement units and GPS systems. Unfortunately, there are environments in which certain sensors do not work reliably, such that the sensor network should be extended by external cameras and guidance. A camera network can aid an automated car's sensor network by adding information that the car cannot gather on its own, for example in environments with limited visibility due to occlusion. Knowledge of objects behind obstructions can lead to an increase in speed as well as ensure the safety of pedestrians. In addition to the aspect of limited visibility, an external guidance system may have accurate maps where the vehicles may have none at all. An external guidance system is able to completely remove the need for a map within the car.

K. Behrendt is a student at RWTH Aachen. This project is being realized by the Institute of Automotive Engineering, RWTH Aachen, Germany.

To achieve this enhancement, a camera network localizes the vehicle and analyzes the vehicle's environment, including dynamic objects, to safely guide it. Possible scenarios include tunnels without GPS signal, parking spaces with limited view, private property and environments that are subject to frequent changes. External guidance can also be useful for unstructured environments with few predefined rules, driving lanes or parking spaces, therefore allowing dynamically maneuvering through and allocating free space. With an external guidance system, the need for lanes, predefined parking spaces and other static markers can be removed. Additionally, this framework allows testing driving maneuvers and path planning in a safe manner. Since it is implemented for scaled vehicles and therefore quite cheap, errors in trajectory planning cannot lead to high cost or injuries. Such a running framework also allows students to gain hands-on experience in path planning and may lead to new approaches for automated vehicles.

Instead of remote controlling the vehicles, this system can also be used to send vital environment information to the car. This can, for instance, be done through wireless signals. For non-automated vehicles, such a system could send warnings or information to drivers by dynamically changing signals or even by projecting markers onto driving lanes.

II. RELATED WORK

Automated driving is being researched by several leading car manufacturers, suppliers, start-ups and universities. To the best of the author's knowledge, there are no papers that have attempted to automate scaled vehicles based on external cameras.

[1] tested localizing vehicles in parking garages by placing two cameras in the environment. They report a mean deviation of 0.37 m during their tests, which is too large to be used for vehicle guidance within a parking garage. Their reported error may be reduced significantly by putting more effort into the placement of the camera system. In an earlier approach, [2] tested environment-embedded lidar sensors with a mean accuracy of about 0.085 m, which would be precise enough for some cases, but did not present a way to use this information.

Finding road networks based on aerial images has been researched by quite a few teams, such as [3], [4] and [5]. While Christmas and Barzohar focus on probabilistic shape analysis after image fusion, Tsai concentrates on the underlying color models and on removing disturbances such as lighting or shadows to detect roads. All of their approaches work well for the image quality they were working with.

Fig. 1. (a) Matched keypoints, (b) occlusion of keypoints, (c) matched result. Left: features are only extracted within the overlapping region; most feature points are not matched correctly due to the local similarity of dashed lane markers. Middle: one half of the very similar keypoints are occluded, which results in a smooth matching. Right: resulting panorama image of two camera frames.

Automated vehicles and small robots are both topics of great interest in the research community. Applications can be found in home service, farming, line following or other robots. Most localization approaches rely on on-board sensors, using simultaneous localization and mapping as proposed by [6], sonar or similar [7], or embedded vision [8]. Up to now, external camera information is rarely used for navigation, except for industrial placing robots and military equipment such as smart missiles [9].

III. DESCRIPTION OF WORK

A. General idea behind framework

Given an environment within which automated cars' sensors do not offer high enough confidence, an external camera system is to be set up for remote controlling such vehicles. For initial testing, this is implemented for scaled vehicles. A camera network is set up such that the complete scene is covered. Based on that camera network, dynamic and static objects have to be perceived, analyzed and sent to the planning algorithm using a defined interface. All steps, from setting up the cameras over the communication between planner and perception up to the communication of the planner with the vehicle, have to be clearly defined.

1) Framework: The proposed framework is built in a modular way, allowing each part to be replaced by an environment-specific algorithm without additional overhead. For a complete environment perception, cameras need to be placed such that every spot within the scene is visible in at least one digital image. Those images need to be combined to create a map of the entire scene within which the vehicles will be guided. Using that fused image material, the scene needs to be analyzed and the environment needs to be divided into road segments, inaccessible parts and dynamic objects. Within the road segments, lane markers need to be detected to be able to calculate lane centers. While most permanently inaccessible areas and optimal paths can be perceived from the beginning, trajectories may only be computable during runtime because of dynamic objects. For storing this information we propose a straightforward model for road information based on line segments, clothoids and circle segments.

In addition to a map, low-latency updates of all vehicle positions are needed. A proposed tracker yields location, size and orientation of vehicles in the scene. The combination of tracker output, stored map and location of dynamic objects is used by the planner to generate trajectories. Changes in the vehicles' locations and orientations are recognized with low latency by the cameras and again processed by the planner.

The following sections explain all steps in more detail.

B. Camera installation and image stitching

1) Camera installation: A system of static cameras needs to be installed such that every spot is visible within at least one of the camera images. For an easier projection, at least one camera should point vertically onto the scene to allow that camera's image to be used as a reference frame. Otherwise, the cameras have to be calibrated to the scene and their images need to be mapped onto the surface. There needs to be a minimum number of pixels per m² to confidently detect the static and dynamic objects' locations with low error. For a less complicated mapping, purely vertically placed cameras are recommended since they provide aerial images, from which the position of an object can be determined in a straightforward manner. Given non-orthogonal views onto the surface, the images are, in addition to stitching, projected onto a fitting image plane as described by [10].

2) Image Stitching: For the environment analysis, the images of all cameras are combined into a single one. On completion of this step, the complete scene is stored within one aerial image. This image is used for the analysis of the scene and the creation of the road map. There are several approaches for registering multiple images, quite a few of which are summarized by Szeliski [10]. One method that can be applied to images without any orientation information is to match features within different images.

First of all, interest points (keypoints) need to be detected. This is possible by finding edges (Canny [11]), corners (Harris [12]), blobs (Laplacian of Gaussian, LoG [13]), or other features. All of those keypoints can be found by varying operators, some of which are described by [12]. For each detected feature, a descriptor such as the Scale-Invariant Feature Transform (SIFT) descriptor [13] or the Speeded-Up Robust Features (SURF) descriptor [12] has to be calculated. Both feature descriptors store a number of weighted gradient directions that can be matched in later stages.

Matching descriptors of perfect road networks represents a non-trivial challenge, since dashed lines and other lane markers create very prominent keypoints and descriptors that are often mapped to incorrect feature points. That is because, locally, multiple components such as lane markers are almost identical, as can be seen in figure (1a). The image matching procedure can, for example, be simplified by placing distinct features within the scene and using only those to register images. In this specific case, reasonable matches can be calculated by occluding feature points that tend to be matched incorrectly or by defining special regions of interest. A combination of both is displayed in figure (1b).
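As a rough illustration of this step, the following C++/OpenCV sketch detects keypoints in two camera frames, occludes one problematic region with a mask and matches the remaining descriptors. The mask rectangle is an assumed input, and ORB is used here only because it ships with core OpenCV; the implementation described later in this paper uses blob detectors with SURF descriptors instead.

#include <opencv2/opencv.hpp>
#include <vector>

// Minimal sketch: detect keypoints, occlude an area whose locally similar
// features (e.g. dashed lane markers) tend to be matched incorrectly, and
// match the remaining descriptors between two frames.
std::vector<cv::DMatch> matchFrames(const cv::Mat& imgA, const cv::Mat& imgB,
                                    const cv::Rect& occludedRegionA) {
    // Mask: 255 = use pixels for keypoint detection, 0 = ignore (occluded).
    cv::Mat maskA(imgA.size(), CV_8UC1, cv::Scalar(255));
    maskA(occludedRegionA).setTo(0);

    cv::Ptr<cv::ORB> detector = cv::ORB::create(1000);
    std::vector<cv::KeyPoint> kpA, kpB;
    cv::Mat descA, descB;
    detector->detectAndCompute(imgA, maskA, kpA, descA);
    detector->detectAndCompute(imgB, cv::noArray(), kpB, descB);

    // Brute-force matching with cross-check to discard asymmetric matches.
    cv::BFMatcher matcher(cv::NORM_HAMMING, /*crossCheck=*/true);
    std::vector<cv::DMatch> matches;
    matcher.match(descA, descB, matches);
    return matches;
}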

With the calculated list of matched keypoints, some of which are most likely going to be incorrect, a homography (projection into the reference image) matrix is calculated. The homography is applied to all homogeneous coordinates p of the non-reference image, which in addition to their x and y coordinates have an entry for scaling, w. Conversion between homogeneous and Cartesian coordinates is done by dividing through a point's weight [14]. Due to homogeneous coordinates being three dimensional, the homography matrix h is a 3x3 matrix. This allows projection, rotation and translation of a point by one matrix multiplication.

p = \begin{pmatrix} x \\ y \\ w \end{pmatrix}, \qquad h = \begin{pmatrix} s_x & k_1 & t_x \\ k_2 & s_y & t_y \\ p_1 & p_2 & s_w \end{pmatrix}    (1)

The homography matrix h can be divided into rigid, similarity and projective transforms. Translation and rotation, the rigid transformations, can be applied by setting the values in the first two rows. t_x and t_y describe the translation, while s_x, k_1, k_2 and s_y set the rotation. In the two-dimensional case only the rotation around the z-axis

R_z = \begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}    (2)

is of interest. k_1 and k_2 alone apply a skew to images, which is part of the similarity transforms. s_x and s_y scale entries along their respective axes, while s_w scales both and is therefore redundant, so it can be set to one. Projective transforms can be applied by setting p_1 and p_2, which is described by [14]. It is possible to determine translation, rotation, skew and projective transforms on their own. The respective matrices are then multiplied to receive the homography. Another approach is to calculate the complete matrix in one step, which is more straightforward given enough matched features.

Direct calculation of the homography is done by using random sample consensus (RANSAC) as described by Fitzgibbon [15]. A sample of matches is taken, from which a projection is calculated. Taking only the keypoints that fit the initial homography, a new homography is calculated. This approach can be performed iteratively and yields a consensus of multiple matches. The homography matrix itself can be calculated by minimizing the least-squares error of an overdetermined linear system, as described by Dahmen and Reusken [16]. If there are not enough keypoints detected or good matches found, this method cannot be applied.

By multiplying each pixel of every image by its homography, all images are mapped to reference coordinates. The result is a panoramic view of all images, as depicted in figure (1c).
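A compact way to realize these two steps is OpenCV's cv::findHomography with the RANSAC flag, followed by cv::warpPerspective. The sketch below is a minimal illustration assuming the matched keypoints from the previous step and a canvas size large enough to hold the panorama; neither is specified in the paper.

#include <opencv2/opencv.hpp>
#include <vector>

// Sketch: estimate the homography from matched keypoints with RANSAC and warp
// the non-reference image into the reference frame. canvasSize is an assumed
// parameter chosen large enough for the stitched panorama.
cv::Mat stitchIntoReference(const cv::Mat& referenceImg, const cv::Mat& otherImg,
                            const std::vector<cv::KeyPoint>& kpRef,
                            const std::vector<cv::KeyPoint>& kpOther,
                            const std::vector<cv::DMatch>& matches,
                            const cv::Size& canvasSize) {
    std::vector<cv::Point2f> ptsOther, ptsRef;
    for (const cv::DMatch& m : matches) {
        ptsRef.push_back(kpRef[m.queryIdx].pt);
        ptsOther.push_back(kpOther[m.trainIdx].pt);
    }
    // 3x3 homography mapping non-reference points into reference coordinates;
    // RANSAC discards the incorrect matches (outliers).
    cv::Mat H = cv::findHomography(ptsOther, ptsRef, cv::RANSAC, 3.0);

    cv::Mat panorama;
    cv::warpPerspective(otherImg, panorama, H, canvasSize);
    // Copy the reference image on top of the warped image.
    referenceImg.copyTo(panorama(cv::Rect(0, 0, referenceImg.cols, referenceImg.rows)));
    return panorama;
}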

C. Roadmap analysis, partitioning and storing

Roadmap analysis depends on the given environment to a large extent. It is obvious that a parking garage requires different algorithms than suburban streets. For each case, the scene needs to be divided into accessible and inaccessible space (road or non-road). For the road segments, several rules and properties such as speed limit, direction and other traffic rules need to be detected and stored. Inaccessible space is modeled implicitly by not explicitly being defined as part of the road.

There are several approaches for detecting road segments. One possibility is to randomly select small regions of pixels from camera images and decide by predefined criteria whether or not that region is part of the road. To calculate the surrounding road, region growing, as described by [17], can be used until either color changes are too significant or an edge is detected. The critical aspect of this approach is the selected stopping criterion [5].

Another approach is to perform segmentation on the input images and decide for each segment whether it is accessible by the vehicle or not. There are several robust methods for segmenting images. It can, for example, be done using mean shift [18], watershed algorithms [19], or plain k-means [20]. One advantage of segmenting is that the differentiation between road and non-road parts can be done by shape and size in addition to color [3].

In the case of very clear lane markings, image segmentation can be done by using edges as segment borders. Edge detection can be performed using Canny edge detection [11]. To remove gaps, the morphological closing operation is performed on the binary image until all lane markers are completely filled. Road segments are then divided from non-road parts of the image by using the detected lane markers as borders. For later calculations, all lane markers are skeletonized by a morphological thinning operation and dashed lines are extended to closed lines. The resulting image contains all lane markers with a width of one pixel.
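A minimal sketch of this pipeline, assuming a grayscale stitched image and hand-picked Canny thresholds and kernel size (which the paper does not specify), could look as follows; the final thinning step would use an implementation such as cv::ximgproc::thinning from the OpenCV contrib modules.

#include <opencv2/opencv.hpp>

// Sketch: detect edges, then close gaps in the lane markers with a
// morphological closing. Thresholds and kernel size are illustrative values.
cv::Mat extractLaneMarkerMask(const cv::Mat& stitchedGray) {
    cv::Mat edges;
    cv::Canny(stitchedGray, edges, /*threshold1=*/50, /*threshold2=*/150);

    // Closing (dilation followed by erosion) fills small gaps, e.g. between
    // the segments of dashed lane markers.
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 5));
    cv::Mat closed;
    cv::morphologyEx(edges, closed, cv::MORPH_CLOSE, kernel, cv::Point(-1, -1),
                     /*iterations=*/3);

    // A thinning/skeletonization step (e.g. cv::ximgproc::thinning from the
    // opencv_contrib modules) would then reduce the markers to one-pixel width.
    return closed;
}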

1) Model segments: From this point on, only binarizedimages depicting lane markers are being used.

Fig. 2. (a) Image of the scene, (b) edges detected within the scene, (c) background subtraction. Left: stitched images of the scene as starting point (simplified); middle: detected edges within the scene; right: image divided into foreground and background, with the road network in the foreground.

a) Line Segments and Detection: To extract straight line segments from the lane markings, the Hough transform [21] is applied. For each pixel within the binary image, each possible line is transformed into the feature space by

r(\varphi) = x_0 \cos(\varphi) + y_0 \sin(\varphi)    (3)

where p(x_0, y_0) represents a pixel in the image space and pp(r, \varphi) is a point in the feature space, represented by polar coordinates. The slope \varphi can either be calculated by taking the gradient of the edge at the point or by inserting multiple points with different slopes \varphi_i into the feature space. Lines in the image are determined by voting in feature space. That means clusters in the feature space represent lines in the image space, which are transformed back by

y = -\frac{\cos\varphi}{\sin\varphi}\, x + \frac{r}{\sin\varphi}    (4)

into image coordinates. Once all lines are calculated, segments of the lane markings need to be assigned to those lines. This is done by assigning pixels that are
a) closer to a line than a given threshold and
b) not the only points within a small region on that line.
The second condition ensures that points on intersecting arcs are not assigned to an incorrect line.
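As a minimal sketch of this line extraction, assuming the one-pixel-wide binary lane-marker image from the previous step and illustrative resolution and threshold parameters, OpenCV's standard Hough transform can be used:

#include <opencv2/opencv.hpp>
#include <vector>

// Sketch: run the standard Hough transform on the binary lane-marker image.
// Each result is a line in polar form (r, phi), matching equation (3).
std::vector<cv::Vec2f> detectLaneLines(const cv::Mat& laneMarkerBinary) {
    std::vector<cv::Vec2f> lines;  // each entry: (r, phi)
    cv::HoughLines(laneMarkerBinary, lines,
                   /*rho resolution=*/1.0,            // pixels
                   /*theta resolution=*/CV_PI / 180,  // radians
                   /*accumulator threshold=*/80);     // min. votes per line
    return lines;
}

The lane-marker pixels would then be assigned to the returned (r, \varphi) lines according to conditions a) and b) above.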

b) Circle Segment Classification: The parameters of a circle, namely radius r and center c, can be determined given any three points on the circle. Given noisy points, multiple r_i and c_i are calculated, so that the best-fitting circle can be selected. Choosing the best parameters can be done with RANSAC [15], using the previously calculated r_i, c_i as seeds.
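For illustration, the circle through three points can be computed in closed form via the circumcenter. The helper below is an assumption of one possible implementation, not code from the paper; calling it repeatedly on sampled point triples yields the r_i, c_i candidates mentioned above.

#include <opencv2/opencv.hpp>
#include <cmath>

// Sketch: circumcircle through three non-collinear points.
// Returns false if the points are (nearly) collinear.
bool circleFromThreePoints(const cv::Point2f& a, const cv::Point2f& b,
                           const cv::Point2f& c, cv::Point2f& center,
                           float& radius) {
    const float d = 2.0f * (a.x * (b.y - c.y) + b.x * (c.y - a.y) + c.x * (a.y - b.y));
    if (std::fabs(d) < 1e-6f) return false;  // collinear, no unique circle

    const float a2 = a.x * a.x + a.y * a.y;
    const float b2 = b.x * b.x + b.y * b.y;
    const float c2 = c.x * c.x + c.y * c.y;
    center.x = (a2 * (b.y - c.y) + b2 * (c.y - a.y) + c2 * (a.y - b.y)) / d;
    center.y = (a2 * (c.x - b.x) + b2 * (a.x - c.x) + c2 * (b.x - a.x)) / d;
    radius = std::hypot(a.x - center.x, a.y - center.y);
    return true;
}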

c) Euler Spirals: Euler spirals, also commonly known as clothoids or Cornu spirals, are used for jerk-free transitions between multiple straight line segments, between multiple circle segments, or to connect a circle segment with a straight line. Clothoids are defined as

\begin{pmatrix} x \\ y \end{pmatrix}(l) = A\sqrt{\pi} \int_0^l \begin{pmatrix} \cos(\pi t^2/2) \\ \sin(\pi t^2/2) \end{pmatrix} dt    (5)

with their curvature being

\kappa(l) = \frac{\sqrt{\pi}}{A}\, l    (6)

where A is a factor that defines the rate of change of the clothoid's curvature. A transitioning spiral between a line and a circle segment can be calculated by determining the angle \theta_c between the line's direction and the tangent of the circle segment at the transition point, and the curvature \kappa_c of the circle. The curvature of a circle is given by

\kappa_c = \frac{1}{R}    (7)

with R being the radius of the circle. We receive a scaled transitioning curve by computing the Fresnel integrals (A = 1) as described by [22],

\begin{pmatrix} C(l) \\ S(l) \end{pmatrix} = \int_0^l \begin{pmatrix} \cos(s^2) \\ \sin(s^2) \end{pmatrix} ds    (8)

up to the point where the clothoid's curvature \kappa and tangent angle \theta equal the circle's curvature \kappa_c and the tangent angle at the transition point \theta_c, respectively. A is then used to stretch the clothoid to fit. In the end, the clothoid has to be translated along the line until its end point is identical to the transition point on the circle segment.
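To make the construction concrete, the Fresnel integrals in equation (8) can be evaluated numerically. The following sketch assumes a simple trapezoidal rule and the cos(s²)/sin(s²) convention used above; it returns a point on the unit clothoid, which would then be scaled by A and rigidly transformed to meet the circle segment.

#include <cmath>
#include <utility>

// Sketch: numerically evaluate the Fresnel integrals
//   C(l) = integral_0^l cos(s^2) ds,  S(l) = integral_0^l sin(s^2) ds
// with the trapezoidal rule, yielding a point (C(l), S(l)) on the unit
// clothoid (A = 1). The step count n is an illustrative accuracy parameter.
std::pair<double, double> fresnel(double l, int n = 1000) {
    double c = 0.0, s = 0.0;
    const double h = l / n;
    for (int i = 0; i < n; ++i) {
        const double s0 = i * h, s1 = (i + 1) * h;
        c += 0.5 * h * (std::cos(s0 * s0) + std::cos(s1 * s1));
        s += 0.5 * h * (std::sin(s0 * s0) + std::sin(s1 * s1));
    }
    return {c, s};
}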

d) Calculating Center of Lanes: Lane centers are calculated by averaging line and circle parameters of close, parallel lane markers. If the resulting segment lies within a road segment, it is also in the middle of a lane. Clothoid parameters cannot be averaged due to the possibly varying directions of left and right lane markers. Transition curves are then calculated for the lane markers. The lane centers within intersections have to be considered individually because of the possible turns onto different lanes and therefore offer additional complexity.

2) Graph-Based Road Network: The road networks are modeled as graphs. Edges represent road segments, while nodes act as connectors between different road segments. A node may connect n ≥ 2 road segments; in the case of n > 2 connected road segments, n − 1 paths are possible. Multiple line segments are to be connected by either two clothoids or a clothoid-circle-clothoid segment. Between each line and circle segment, there needs to be a clothoid for smooth transitioning. Two circle segments have to be connected by two clothoids that act as jerk-free transition curves. For each road segment, at least position, length and the number of lanes are stored. The minimum information stored for each connector consists of location and possible paths.
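One possible data layout for this graph model, storing the per-segment and per-connector minimum information described above, is sketched below; the field names and the index-based linking are assumptions for illustration, not prescribed by the paper.

#include <vector>
#include <cstddef>

// Sketch of a graph-based road network. Geometry types mirror the road model:
// line segments, circle segments and clothoids.
enum class GeometryType { Line, CircleSegment, Clothoid };

struct RoadSegment {            // graph edge
    GeometryType type;
    double x, y;                // position (start point in map coordinates)
    double length;              // segment length
    int numLanes;               // number of lanes
    std::size_t fromConnector;  // indices into RoadNetwork::connectors
    std::size_t toConnector;
};

struct Connector {              // graph node
    double x, y;                                 // location
    std::vector<std::size_t> connectedSegments;  // possible paths (segment indices)
};

struct RoadNetwork {
    std::vector<RoadSegment> segments;
    std::vector<Connector> connectors;
};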


Fig. 3. (a) Tracked vehicle. The vehicle as it is being tracked, displayed with an axis-aligned bounding box and an area-minimizing rectangle. From the parameters of the area-minimizing rectangle, position and orientation can be calculated in a very straightforward manner.

D. Vehicle tracking

There are several options for tracking the vehicles within the scene. Vehicle tracking can be performed on each camera frame individually, without prior image stitching. That can be done because the homography for the static transformation between camera images can be stored after its initial calculation. The transformations only need to be applied after a dynamic object has been detected, and only to its location and orientation. Depending on the scene, different tracking algorithms are favorable. It is, for example, possible to use color tracking, template tracking, or to model the background and register changes within the scene [23].

1) Color Tracker: If applicable, the color tracker most likely offers the fastest implementation, while also offering robust object tracking. Color tracking can only be used if the dynamic objects differ in color from the rest of the scene. It can be implemented by storing a color model within a preferred color space. Lower and upper thresholds for the color components are stored and used for background segmentation. For example, using an HSV color space, there needs to be a lower and an upper threshold each for hue, saturation and value. Those thresholds are chosen such that most pixels within the object lie within the given range while most other pixels do not. Performing morphological operations such as open-close on the result removes noise and fills in missing pixels of the object. Given such a color model, the image can be segmented into fore- and background, such that the foreground consists solely of the vehicle. Given this segmentation, a minimum-area bounding rectangle can be calculated in addition to the center of the object. From the minimum-area bounding rectangle, the position, orientation and size of the vehicle are calculated.
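A minimal sketch of such a color tracker in C++/OpenCV is shown below; the HSV thresholds are illustrative values that would have to be tuned to the actual vehicle color, and taking the largest foreground blob as the vehicle is an assumption for the single-vehicle case.

#include <opencv2/opencv.hpp>
#include <algorithm>
#include <vector>

// Sketch: segment the vehicle by HSV thresholds, clean the mask with
// open/close, and derive position and orientation from the minimum-area
// bounding rectangle.
bool trackVehicleByColor(const cv::Mat& frameBgr, cv::RotatedRect& vehicleBox) {
    cv::Mat hsv, mask;
    cv::cvtColor(frameBgr, hsv, cv::COLOR_BGR2HSV);
    cv::inRange(hsv, cv::Scalar(100, 120, 70), cv::Scalar(130, 255, 255), mask);

    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5));
    cv::morphologyEx(mask, mask, cv::MORPH_OPEN, kernel);   // remove noise
    cv::morphologyEx(mask, mask, cv::MORPH_CLOSE, kernel);  // fill holes

    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty()) return false;

    // Take the largest foreground blob as the vehicle.
    auto largest = std::max_element(contours.begin(), contours.end(),
        [](const std::vector<cv::Point>& a, const std::vector<cv::Point>& b) {
            return cv::contourArea(a) < cv::contourArea(b);
        });
    vehicleBox = cv::minAreaRect(*largest);  // center, size and angle
    return true;
}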

2) Tracking by Background Subtraction: Another option for tracking objects within the scene is to move static content into the background, which leaves dynamic objects in the foreground. Using adaptive background mixture models as proposed by Stauffer [23], noise and recurring changes can be modeled as background, so that all foreground objects are dynamic objects, which can be tracked.
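OpenCV ships a mixture-of-Gaussians background subtractor in the spirit of [23] (its MOG2 variant); the sketch below assumes its default parameters and a simple threshold to discard shadow pixels.

#include <opencv2/opencv.hpp>

// Sketch: mixture-of-Gaussians background subtraction. Static content and
// recurring changes are absorbed into the background model; the returned mask
// contains the dynamic (foreground) objects.
class BackgroundTracker {
public:
    BackgroundTracker() : subtractor_(cv::createBackgroundSubtractorMOG2()) {}

    cv::Mat foregroundMask(const cv::Mat& frame) {
        cv::Mat mask;
        subtractor_->apply(frame, mask);  // update model and segment frame
        // MOG2 marks shadows with an intermediate value; keep only foreground.
        cv::threshold(mask, mask, 200, 255, cv::THRESH_BINARY);
        return mask;
    }

private:
    cv::Ptr<cv::BackgroundSubtractorMOG2> subtractor_;
};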

3) Feature Tracking: By creating visual patterns on the objects or having distinctly colored vehicles, they can be detected by keypoint matching. There need to be at least two distinct features on each object to detect the location and orientation of dynamic objects. Additional keypoints are beneficial in moments of occlusion. For feature detection and recognition, for example the Speeded-Up Robust Features (SURF) or Scale-Invariant Feature Transform (SIFT) local feature detectors can be used.

4) Speeding Up Tracking by Using Predictions: Once a dynamic object is labeled, its tracking procedure can be sped up by predicting its future location. Based on the current position, orientation and velocity, future positions and a maximal deviation from the current position are estimated. In this case, tracking algorithms only have to check a certain area around the estimates. High frame rates with fast tracking algorithms lead to high-frequency tracking. The higher the tracking frequency, the less an object moves between tracking steps, therefore shrinking the area to be checked.
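As a rough illustration, the search region for the next frame could be predicted as below; the constant-velocity assumption and the fixed uncertainty margin are assumptions for the sketch, since the paper does not fix a motion model.

#include <opencv2/opencv.hpp>

// Sketch: predict where the vehicle will be in the next frame under a
// constant-velocity assumption and return a search window around it.
// margin accounts for the maximal deviation from the predicted position.
cv::Rect predictSearchRegion(const cv::Point2f& position, const cv::Point2f& velocity,
                             float dt, float margin, const cv::Size& frameSize) {
    const cv::Point2f predicted = position + velocity * dt;
    cv::Rect roi(static_cast<int>(predicted.x - margin),
                 static_cast<int>(predicted.y - margin),
                 static_cast<int>(2 * margin), static_cast<int>(2 * margin));
    // Clamp the search region to the image bounds.
    return roi & cv::Rect(0, 0, frameSize.width, frameSize.height);
}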

IV. EVALUATION

The framework is currently partially implemented in C++ with OpenCV (Open Source Computer Vision) on a 4th generation i7 quad-core machine running Windows 7.

The camera network consists of two cameras, each facing the scene vertically and covering it completely. Image stitching is implemented using feature points with blob detectors and SURF [12] descriptors. The problem of locally similar keypoints is solved by occluding the majority of those feature points that tended to be matched incorrectly. All matched keypoints are then used holistically to calculate a homography with a RANSAC [15] approach. The homography is calculated only once, unless there are changes in the camera positioning.

The roadmap analysis works for the scene depicted in figure (2a) and for similar road maps. It currently supports two-lane environments with the same lane width on both sides. Lane markers in the middle of the road are not considered yet. Intersections, turns and lane centers have yet to be implemented.

Vehicle tracking is implemented based on a color tracker in HSV space at 30 frames per second. The tracker works reliably with a maximum error of approximately 5 mm. For the vehicle control, this needs to be filtered, e.g. by using a Kalman filter [24]. Accurate measurements will be provided once the system is running properly. Lane centers, vehicle control and the vehicle itself are not yet running.


V. CONCLUSION AND OUTLOOK

In this paper, a way to implement road detection and analysis based solely on a video camera network monitoring the environment was presented. It consists of four main parts: camera network installation and image stitching, road network analysis, vehicle tracking and vehicle control. Multiple parts can be implemented using various approaches, each favorable in different situations. After an initial calibration of the camera network and initialization of the scene, the system can run in real time and does not need to recalculate its road map until there are changes in the environment or camera placement. This seminar paper serves as a starting point for implementing a system of similar functionality.

There are many applications where this framework can be used for testing, learning and optimization. The first and most important objective at this time is to complete the basic functionality, after which the framework can be extended. There are many possibilities to extend the presented framework: it is possible to add multiple-vehicle support, parking space functionality, lane changes, intersection support, multiple lanes, traffic signs and more.

REFERENCES

[1] M. S. Andre Ibisch, Sebastian Houben, “Towards highly automated driving in a parking garage: General object localization and tracking using an environment-embedded camera system,” IEEE Intelligent Vehicles Symposium (IV), Jun. 2014.

[2] A. Ibisch, S. Stumper, H. Altinger, M. Neuhausen, M. Tschentscher, M. Schlipsing, J. Salinen, and A. Knoll, “Towards autonomous driving in a parking garage: Vehicle localization and tracking using environment-embedded lidar sensors,” in Intelligent Vehicles Symposium (IV), 2013 IEEE. IEEE, 2013, pp. 829–834.

[3] W. Christmas, J. Kittler, and M. Petrou, “Structural matching in computer vision using probabilistic relaxation,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 17, no. 8, pp. 749–764, Aug. 1995.

[4] M. Barzohar and D. Cooper, “Automatic finding of main roads in aerial images by using geometric-stochastic models and estimation,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 18, no. 7, pp. 707–721, Jul. 1996.

[5] V. Tsai, “A comparative study on shadow compensation of color aerial images in invariant color models,” Geoscience and Remote Sensing, IEEE Transactions on, vol. 44, no. 6, pp. 1661–1671, Jun. 2006.

[6] M. W. M. G. Dissanayake, P. Newman, S. Clark, H. Durrant-Whyte, and M. Csorba, “A solution to the simultaneous localization and map building (SLAM) problem,” Robotics and Automation, IEEE Transactions on, vol. 17, no. 3, pp. 229–241, Jun. 2001.

[7] A. Elfes, “Sonar-based real-world mapping and navigation,” Robotics and Automation, IEEE Journal of, vol. 3, no. 3, pp. 249–265, Jun. 1987.

[8] G. DeSouza and A. Kak, “Vision for mobile robot navigation: A survey,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 24, no. 2, pp. 237–267, Feb. 2002.

[9] D. Hall and J. Llinas, “An introduction to multisensor data fusion,” Proceedings of the IEEE, vol. 85, no. 1, pp. 6–23, Jan. 1997.

[10] R. Szeliski, “Image alignment and stitching: A tutorial,” Found. Trends. Comput. Graph. Vis., vol. 2, no. 1, pp. 1–104, Jan. 2006. [Online]. Available: http://dx.doi.org/10.1561/0600000009

[11] J. Canny, “A computational approach to edge detection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 8, no. 6, pp. 679–698, Jun. 1986. [Online]. Available: http://dx.doi.org/10.1109/TPAMI.1986.4767851

[12] H. Bay, T. Tuytelaars, and L. Van Gool, “SURF: Speeded up robust features,” in Computer Vision–ECCV 2006. Springer, 2006, pp. 404–417.

[13] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.

[14] T. Akenine-Moller, E. Haines, and N. Hoffman, Real-Time Rendering. AK Peters, 2002.

[15] A. W. Fitzgibbon, “Simultaneous linear estimation of multiple view geometry and lens distortion,” 2001.

[16] W. Dahmen and A. Reusken, Numerik für Ingenieure und Naturwissenschaftler. Springer, 2008, vol. 2.

[17] R. Adams and L. Bischof, “Seeded region growing,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 16, no. 6, pp. 641–647, Jun. 1994. [Online]. Available: http://dx.doi.org/10.1109/34.295913

[18] D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 5, pp. 603–619, May 2002. [Online]. Available: http://dx.doi.org/10.1109/34.1000236

[19] L. Vincent and P. Soille, “Watersheds in digital spaces: An efficient algorithm based on immersion simulations,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 13, no. 6, pp. 583–598, Jun. 1991. [Online]. Available: http://dx.doi.org/10.1109/34.87344

[20] T. N. Pappas, “An adaptive clustering algorithm for image segmentation,” IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992. [Online]. Available: http://dblp.uni-trier.de/db/journals/tsp/tsp40.html#Pappas92

[21] D. A. Forsyth and J. Ponce, Computer Vision: A Modern Approach. Prentice Hall Professional Technical Reference, 2002.

[22] B. B. Kimia, I. Frankel, and A.-M. Popescu, “Euler spiral for shape completion,” International Journal of Computer Vision, vol. 54, no. 1–3, pp. 159–182, 2003.

[23] C. Stauffer and W. E. L. Grimson, “Adaptive background mixture models for real-time tracking,” in Computer Vision and Pattern Recognition, 1999. IEEE Computer Society Conference on, vol. 2. IEEE, 1999.

[24] M. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, “A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking,” Trans. Sig. Proc., vol. 50, no. 2, pp. 174–188, Feb. 2002. [Online]. Available: http://dx.doi.org/10.1109/78.978374