
Editor: Alberto Broggi, University of Parma, Italy, [email protected]

Intelligent Transportation Systems

Vision in and out of Vehicles

Luke Fletcher, Nicholas Apostoloff, Lars Petersson, and Alexander Zelinsky, Australian National University

Editor's Perspective

Although road congestion and accidents in Australia are relatively small problems compared to other parts of the world, many laboratories there are developing driver aids. This article describes research activities at the Australian National University. There, a team is studying and testing automatic lane detection, obstacle detection, and driver monitoring on an experimental vehicle.

Feel free to contact me regarding this department. I also seek contributions on the status of ITS projects worldwide as well as ideas on and trends in future transportation systems. Contact me at [email protected]; www.ce.unipr.it/broggi. —Alberto Broggi

1094-7167/03/$17.00 © 2003 IEEE. IEEE INTELLIGENT SYSTEMS. Published by the IEEE Computer Society.

With its wide roads and low congestion, Canberra, Australia, has relatively few serious accidents. However, two recently occurred. In the first, a pregnant woman died in her car while waiting at an intersection. Her baby daughter was delivered seven weeks early by emergency Caesarian section and is in critical condition. The girl's 10-month-old brother, who was in a child restraint in the backseat, is also in critical condition. The accident happened when a car, thought to have been traveling at up to 80 km/h (the speed limit), failed to stop. Days later, another fatality occurred, this time a truck driver. The truck left its lane, jumped the guard rail, and hit an embankment. Both accidents happened in fine driving conditions on the city's edge.

Perhaps the drivers could have avoided these accidents if they had received a warning about the impending situation. Almost every driver has experienced a warning from a passenger, perhaps about an obscured car while merging or a jaywalking pedestrian in a blind spot. Such warnings could save countless lives every day.

We believe that, in the near future, such assistance will come from the vehicle itself. Many vehicles already employ computer-based driver assistance in the form of antilock braking systems or adaptive engine management systems. However, more than a decade after autonomous-system technologies emerged, systems such as those that Universität der Bundeswehr Munich [1] and Carnegie Mellon University's NavLab group [2] developed have not been realized commercially. A key distinction between existing systems and promising R&D systems is driver and vehicle manufacturer acceptance. Because a paradigm shift to autonomous vehicles is unlikely, intelligent-vehicle technologies must make it to the road through subsystems instead. These subsystems should solve a small, well-defined task that supports, not replaces, the driver.

At the Australian National University's Intelligent Vehicle Project, we are developing such subsystems for

• Driver fatigue or inattention detection
• Pedestrian spotting
• Blind-spot checking and merging assistance to validate whether sufficient clearance exists between cars
• Driver feedback for lane keeping
• Computer-augmented vision (that is, lane boundary or vehicle highlighting on a head-up display)
• Traffic sign detection and recognition
• Human factors research aids

Systems that perform such supporting tasks are generally called driver assistance systems (DAS). Although we can't know the actual circumstances of the accidents we mentioned earlier, we believe that implementing DAS could prevent similar accidents or at least reduce their severity.

DAS goals

Robustness is of paramount importance for systems in cars driven on public roads. Solutions to sensing and detection problems must be reliable. Fortunately, roads are designed to be high contrast, predictable in layout, and governed by simple rules. This makes the sensing problem somewhat easier, although by no means trivial. Complementary sensors and algorithms can reduce a catastrophic failure's likelihood, but robust systems must incorporate performance metrics and graceful failure modes from the start.

Systems must be operable in all driving environments. This means systems must be able to handle urban environments as well as highways. Urban environments have


proved difficult owing to an explosion in the diversity of road scenarios and the lower signal-to-noise ratio of content in cluttered urban scenes. Human drivers rely much more extensively on predicting the behavior of other road users and pedestrians in these scenarios than in highway situations. These powers of higher reasoning, which often involve making eye contact with other road users, are not easily modeled and will not come easily to AI systems.

As with most computer vision problems, DAS systems that work 80 to 90 percent of the time are orders of magnitude simpler to implement than systems that work 96 to 99 percent of the time. System designers often face choosing between a pedantic system that continually reports false positives and an underconfident system that must continually admit that it can't help at the moment. The ANU Intelligent Vehicle Project addresses this dilemma by observing the driver. Monitoring the driver, particularly where the driver is looking, can avoid many interruptions. That is, if the driver is looking at a potential problem or an uncertain area in the road scene, a warning is irrelevant. This strategy is based on DAS's higher goal to assist drivers by informing them of occurrences of which they might not be aware, not second-guessing drivers' choices when they are paying attention. So, in a complex traffic scene, as long as the driver has noted an identified hazard, such as an overtaking car, or an unpredictable hazard, such as a wandering pedestrian, the system gives no alert.

Finally, driver assistance systems must also be

• Intuitive. Their behavior makes immediate sense in the context of the standard driving task.
• Nonintrusive. They don't distract or disrupt the driver unless they deem it necessary.
• Overridable. The driver has ultimate control and can refuse assistance.

The Transport Research Experimental Vehicle

The Intelligent Vehicle Project's platform is the Transport Research Experimental Vehicle (TREV), a Toyota Land Cruiser with a variety of sensors and actuators that support various ITS-related research. Vision is the primary sense used on board TREV, which incorporates two major systems (see Figure 1). An ANU-developed CeDAR (Cable Drive Active Vision Robot) stereo active camera platform, which replaces the rear-view mirror, monitors the road scene in front of the vehicle. This system lets the cameras rotate left and right independently on a shared tilt axis. To monitor the driver, TREV uses a dashboard-mounted faceLAB head-and-eye-tracking system (www.seeingmachines.com/technology/faceLAB.htm).

TREV also includes the typical range of vehicle-monitoring devices: Global Positioning System technology, inertial-navigation sensing, and speed and steering-angle sensors. Throttle and steering actuators support lane-keeping and automatic-cruise-control-style experiments.

Robustness through intelligent use of visual information

Despite many impressive past results in visual processing, no single visual-processing method can perform reliably in all traffic situations. Achieving stable vision-based subsystems will require executing multiple image-processing methods and selecting them on the basis of the prevailing conditions. We have developed Distillation, a visual-cue-processing framework that lets us deal with such robustness issues. Distillation combines a visual-cue-scheduling and data fusion system with a particle filter (see Figure 2). (Particle filtering is also

MAY/JUNE 2003 computer.org/intelligent 13

Figure 2. The Distillation visual-cue-processing framework. On the left, visual cues execute on the basis of merit and available computational resources; on the right, a particle filter combines the results.

Figure 1. The Transport Research Experimental Vehicle uses two vision platforms: the CeDAR (Cable Drive Active Vision Robot) active-vision head and faceLAB passive stereo cameras.


known as the condensation algorithm or Monte Carlo sampling algorithm.)

Distillation lets us

• Combine visual cues on the basis of Bayesian theory
• Allocate computational resources over the suite of visual cues on the basis of merit
• Employ top-down hypothesis testing instead of reconstructive techniques that are often poorly conditioned
• Integrate visual-cue performance metrics so that we can safely deal with deteriorating performance
• Combine visual cues running at different frame rates

We have also demonstrated Distillation for indoor people tracking [3].
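The update loop that Figure 2 depicts can be illustrated with a minimal particle filter. The sketch below is not the project's implementation: the one-dimensional state, the two cue likelihoods, and all parameters are hypothetical, and cues are fused as a product of likelihoods (a Bayesian combination under a conditional-independence assumption).

```python
import random

# State: a 1-D lateral offset from the lane centerline, in meters.
# Both cue likelihoods are illustrative stand-ins, not the project's cues.
def edge_cue(offset):
    # Hypothetical lane-edge evidence, peaking at zero offset.
    return max(0.0, 1.0 - abs(offset) / 2.0)

def color_cue(offset):
    # Hypothetical road-color-consistency evidence, broader than edge_cue.
    return max(0.0, 1.0 - abs(offset) / 3.0)

def distillation_step(particles, cues, sigma=0.05):
    """One update: diffuse particles, weight by fused cues, resample."""
    moved = [p + random.gauss(0.0, sigma) for p in particles]
    weights = []
    for p in moved:
        w = 1.0
        for cue in cues:
            w *= max(cue(p), 1e-9)  # product fusion of cue likelihoods
        weights.append(w)
    # Resample in proportion to the fused weights.
    return random.choices(moved, weights=weights, k=len(moved))

random.seed(0)
particles = [random.uniform(-2.0, 2.0) for _ in range(500)]
for _ in range(30):
    particles = distillation_step(particles, [edge_cue, color_cue])
print(round(sum(particles) / len(particles), 2))  # mean converges near 0
```

Because both cue likelihoods peak at zero offset, repeated resampling concentrates the particle set around the fused maximum, which is the "winning hypothesis" the text refers to.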

As part of the Intelligent Vehicle Project, we have developed two applications that exploit the Distillation framework: one for lane tracking and one for obstacle detection and tracking.

Lane tracking

This application combines visual cues such as edges for finding lane marks and road color consistency with cues based on physical-world constraints such as vanishing points and plausible road shapes. From these cues, it distills a winning hypothesis of the vehicle's position with respect to the road and the geometry of the road ahead.

The Distillation framework reduces the lane tracker's search space. Distillation concurrently estimates the road width, the vehicle's lateral offset from the road's centerline, and the vehicle's yaw with respect to the centerline. The application estimates the horizontal and vertical road curvature in the far field.

Figure 3 shows the lane tracker's output in several situations using four different cues.
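The top-down hypothesis testing described above can be sketched as follows: each hypothesis proposes a (road width, lateral offset, yaw) state, predicts where the lane marks it implies would appear, and is scored against observed marks. The projection model, names, and noise scale here are simplifying assumptions (a straight, flat road with no curvature terms).

```python
import math

def lane_edges_at(distance, width, offset, yaw):
    # Lateral positions of the left/right lane marks 'distance' meters
    # ahead, in vehicle coordinates, assuming a straight, flat road.
    center = -offset + distance * math.tan(yaw)
    return center - width / 2.0, center + width / 2.0

def score(hypothesis, observations, sigma=0.3):
    # Gaussian agreement between predicted and observed edge positions;
    # observations maps look-ahead distance -> (left, right) positions.
    width, offset, yaw = hypothesis
    total = 0.0
    for dist, (obs_l, obs_r) in observations.items():
        pred_l, pred_r = lane_edges_at(dist, width, offset, yaw)
        err2 = (pred_l - obs_l) ** 2 + (pred_r - obs_r) ** 2
        total += math.exp(-err2 / (2.0 * sigma ** 2))
    return total / len(observations)

# Synthetic observations from a "true" state: 3.5 m lane, 0.2 m offset.
truth = (3.5, 0.2, 0.01)
obs = {d: lane_edges_at(d, *truth) for d in (5.0, 10.0, 20.0)}
```

A correct hypothesis scores near 1 while displaced ones score lower, which is exactly the gradient that drives resampling toward the true state without ever reconstructing the road.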

Obstacle detection and tracking

This application has three main levels. The most primitive level uses a set of "bottom-up" whole-image techniques to search the image space for likely obstacle candidates. This level primarily uses stereo disparity and optical flow. Although the disparity and flow information are very noisy, we can combine them to form a 3D depth flow field (see Figure 4). We can then use this field to coarsely segment for potential


Figure 3. Lane tracker results. Yellow lines indicate the estimated lane boundary. Tracking is robust against shadows, obstacles, and misleading lines on the road.


obstacles using clustering of 3D-flow vectors in the near to medium range.

We can also use color consistency to derive possible obstacle candidates.

The application injects sets of particles representing each obstacle candidate into the particle filter state space inside the Distillation framework. Distillation then tracks the obstacles between frames.
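The coarse segmentation step can be sketched as a simple grouping over a 3D depth flow field: points whose flow departs from the flow the vehicle's own motion would induce are kept, then nearby survivors are clustered into obstacle candidates. The greedy threshold clustering and all numbers below are illustrative stand-ins for the project's actual segmentation.

```python
def moving_points(depth_flow, ego_flow, thresh=0.5):
    # Keep 3D points whose flow vector differs enough from the
    # ego-motion-induced flow (a crude outlier test against the background).
    kept = []
    for point, flow in depth_flow:
        if sum((f - e) ** 2 for f, e in zip(flow, ego_flow)) > thresh ** 2:
            kept.append(point)
    return kept

def cluster(points, radius=1.0):
    # Greedy threshold clustering: a point joins the first cluster whose
    # seed point lies within 'radius' on every axis, else starts a new one.
    clusters = []
    for p in points:
        for c in clusters:
            if all(abs(p[i] - c[0][i]) <= radius for i in range(3)):
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters
```

Each resulting cluster would seed one set of particles for the tracker; the noisiness of disparity and flow is what makes the loose thresholds, rather than exact matching, appropriate here.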

Robustness through driver monitoring

Driver monitoring can

• Reduce false alarm rates.
• Directly detect driver fatigue through head slumping and prolonged eye closure [4].
• Detect driver inattention to the road scene in general or particular regions.
• Direct DAS attention: "The driver is looking in a particular direction; why?"

If we augment driver monitoring with information about the vehicle and traffic state, we can make additional inferences about the driver. For example, by correlating automated lane tracking and driver gaze monitoring, we can eliminate considerable tediousness from human-factors-style experiments.

Seeing Machines developed the faceLAB driver-monitoring system in conjunction with ANU and Volvo Technology Corporation. FaceLAB uses a passive pair of cameras to capture video images of the driver's head. It then processes these images in real time to determine the 3D pose of the person's face (±1 mm, ±1 degree) as well as the eye gaze direction (±3 degrees), blink rates, and eye closure. Figure 5 shows the experimental setup.

The results in Figure 6 show a clear directional bias based on the prevailing road curvature. That is, when driving on a road that curves to the right, the driver focuses more on objects on that side, and vice versa. Figure 7 shows a strong correlation between the vehicle yaw angle from the lane tracker and the gaze direction. That is, the driver is more likely to veer in the direction in which he or she is gazing.
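The kind of relationship Figure 7 plots can be quantified with a Pearson correlation coefficient over time-synchronized gaze and yaw samples. The short series below are fabricated purely for illustration; only the coefficient computation itself is standard.

```python
import math

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length series.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative synchronized samples (degrees), gaze roughly tracking yaw.
gaze = [-5.0, -2.0, 0.0, 3.0, 6.0, 8.0, 5.0, 1.0]
yaw = [-4.0, -2.5, 0.5, 2.0, 5.5, 7.0, 4.0, 0.5]
print(round(pearson(gaze, yaw), 3))
```

A coefficient near 1 would support the "veer where you gaze" observation; computing it per trip is one way to replace manual review in human-factors experiments.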

These preliminary results indicate that some of the assumptions about driver gaze that we encapsulated into our subsystems are justifiable. (For example, two common assumptions are that drivers tend to gaze at the inner edge of curves in the road ahead and that drivers


Figure 4. Obstacle detection and tracking: (a) the left image from a stereo pair; (b) a 3D surface from stereo disparity (the rectangle indicates the region of 3D depth flow); (c) 3D depth flow.


periodically glance [not stare] at the lane of oncoming traffic.) One such subsystem under development is a lane-keeping system that applies a restorative force to the steering wheel if the car approaches the lane boundaries. If the driver looks as if he or she is preparing to change lanes, the subsystem reduces this corrective force in the relevant direction until the lane change is complete.
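A minimal sketch of that gaze-modulated corrective force follows. The gains, thresholds, and the discrete gaze signal are illustrative assumptions, not the subsystem's actual control law.

```python
def restorative_torque(offset, half_width, gaze_dir, k=1.0, relax=0.2):
    """Corrective steering torque for lane keeping.

    offset:     lateral offset from lane center in meters (positive = right)
    half_width: half the lane width in meters
    gaze_dir:   -1 looking left, 0 looking ahead, +1 looking right
    """
    margin = half_width - abs(offset)
    if margin > 0.5:           # comfortably inside the lane: no correction
        return 0.0
    torque = -k * offset       # push back toward the centerline
    drifting_right = offset > 0.0
    if gaze_dir != 0 and (gaze_dir > 0) == drifting_right:
        torque *= relax        # driver looks where the car heads: ease off
    return torque
```

Near the right boundary the full correction applies, but a glance to the right (the presumed lane-change direction) scales it down; once the lane change completes and the offset resets, full assistance returns.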

The Intelligent Vehicle Project is still in its early days. Vehicle and pedestrian detection, blind-spot monitoring, and road-sign detection are all in the pipeline, as are new interfacing systems and context-relevant driver feedback systems such as force feedback and auditory signals.

The need for such driver assistance subsystems is real. Until each vehicle comes


Luke Fletcher is a PhD candidate in engineering at the Robotic Systems Laboratory. Contact him at the Robotic Systems Laboratory, Dept. of Systems Eng., Research School of Information Sciences and Eng., Australian Nat'l Univ., Canberra, ACT 0200 Australia; [email protected].

Nicholas Apostoloff is a PhD candidate in engineering at the University of Oxford's Visual Geometry Group. He previously was a master's student at the Australian National University researching vision-based lane tracking. Contact him at the Visual Geometry Group, Univ. of Oxford, Oxford, UK; [email protected].

Lars Petersson is a postdoctoral research fellow at the Robotic Systems Laboratory. Contact him at the Robotic Systems Laboratory, Dept. of Systems Eng., Research School of Information Sciences and Eng., Australian Nat'l Univ., Canberra, ACT 0200 Australia; [email protected].

Alexander Zelinsky is a professor of systems engineering in the Australian National University's Research School of Information Sciences and Engineering, where he leads a research team working on intelligent vehicles, mobile robotics, and human–machine interaction. He is also the CEO of Seeing Machines. Contact him at the Robotic Systems Laboratory, Dept. of Systems Eng., Research School of Information Sciences and Eng., Australian Nat'l Univ., Canberra, ACT 0200 Australia; [email protected].

Figure 6. Proportions of viewing direction on left- and right-curvature roads. When driving on a road that curves to the right, the driver focuses more on objects on that side, and vice versa.

Figure 5. The integration of lane tracking with driver eye gaze tracking. The top left shows the lane tracker's output. The top right shows the video camera arrangement. The bottom shows segmentation of the field of view for analysis. (The segmented regions include near, far, and vertical fields; left and right of road; the detected lane, in red; and the oncoming traffic lane, in black; at ranges of 9, 29, and 129 m.)


with its own inbuilt set of vigilant passengers, may the number of lives lost be low.

References

1. E.D. Dickmanns and A. Zapp, "Autonomous High Speed Road Vehicle Guidance by Computer Vision," Automatic Control—World Congress, 1987: Selected Papers from the 10th Triennial World Congress of the Int'l Federation of Automatic Control, Pergamon, 1987, pp. 221–226.

2. C. Thorpe, Vision and Navigation: The Carnegie Mellon NavLab, Kluwer Academic Publishers, 1990.

3. G. Loy et al., "An Adaptive Fusion Architecture for Target Tracking," Proc. 5th Int'l Conf. Automatic Face and Gesture Recognition, IEEE CS Press, 2002, pp. 261–266.

4. N.L. Haworth, T.J. Triggs, and E.M. Grey, Driver Fatigue: Concepts, Measurement and Crash Countermeasures, Federal Office of Road Safety Contract Report 72, Human Factors Group, Dept. of Psychology, Monash Univ., 1988.

Figure 7. Reconciliation between gaze direction and vehicle yaw angle. The driver is more likely to veer in the direction in which he or she is gazing.

(Axes: yaw in degrees versus time in seconds; plotted series: driver gaze, vehicle yaw, and average vehicle yaw.)
