Detection and prediction of lane-changes:

A study to infer driver intent using support vector machine

STAFFAN BENGTSSON

Master of Science Thesis

Stockholm, Sweden 2012

Master of Science Thesis MMK 2012:08 MDA 429

KTH Industrial Engineering and Management

Machine Design

SE-100 44 STOCKHOLM

Master of Science Thesis MMK 2012:08 MDA 429

Detection and prediction of lane-changes: A study to infer driver intent using support vector machine

Staffan Bengtsson

Approved

2012-03-28

Examiner

Mats Hanson

Supervisor

Bengt Eriksson

Commissioner

Scania CV AB

Contact person

Joseph Ah-King

Abstract

This master's thesis was carried out in cooperation between Scania CV AB in Södertälje and the Department of Machine Design at KTH in Stockholm. Scania CV AB is a leading manufacturer of heavy trucks, buses and coaches, and industrial and marine engines.

Increasing road safety demands for vehicles in development have resulted in a new generation of safety systems, called Advanced Driver Assistance Systems (ADAS). These new systems not only try to mitigate the effects of a crash, they also try to prevent crashes from happening. To keep ADAS from interfering during controlled maneuvers by a driver, it is necessary to determine the intent and current actions of the driver from sensor measurements. This information will help safety systems decide when to engage in assistive actions.

In this study, the possibilities of detecting and predicting lane-changes from patterns in sensor measurements have been investigated using a truck. The main objectives were to decide how to approach this type of problem without using turn signals, to estimate the accuracies that can be achieved and to determine which sensors are required to solve the task.

Based on the concept evaluation, the pattern recognition technique support vector machine (SVM) was chosen for the task. Real vehicle data, recorded from a truck during motorway driving, was used to train and test the algorithm. From this data the algorithm was able to classify lane-keep and lane-change actions correctly with an accuracy of 82%. The algorithm could also quickly detect a lane-change before the first wheel crossed the lane marking.

With the set-up of sensors available in the test vehicle, no significant patterns were found in driver actions prior to a lane-change (the intent). Therefore, an alternative approach was suggested that aims to determine from contextual data what the driver should prefer to do in a specific situation. This additional information helped to reduce the number of false positives by 8 percentage points when classifying between lane-change and lane-keep actions.

Master of Science Thesis MMK 2012:08 MDA 429

Detection and prediction of lane changes: Inferring a driver's intent with support vector machine

Staffan Bengtsson

Approved

2012-03-28

Examiner

Mats Hanson

Supervisor

Bengt Eriksson

Commissioner

Scania CV AB

Contact person

Joseph Ah-King

Summary

The thesis work covered in this report was carried out in cooperation with Scania CV AB in Södertälje and the Department of Machine Design at KTH in Stockholm. Scania CV AB is a leading manufacturer of heavy trucks, buses, and industrial and marine engines.

With ever tougher road safety requirements for vehicles under development, a new generation of safety systems has emerged: Advanced Driver Assistance Systems (ADAS). These new systems try not only to reduce the effects of an accident but also to prevent it from occurring. This places high demands on the systems intervening only when needed and not working against the driver during normal driving. It is therefore of interest to investigate how a driver's actions and intentions are reflected in sensor values, in order ultimately to improve the decision-making ability of the safety systems in different situations.

In the project, a study was made to see whether there are clear patterns in sensor values, other than the turn signals, before and during a lane change on a motorway with a truck. The purpose of the study was to investigate how this type of problem can be solved, to estimate the accuracies that can be achieved, and to find out which set of sensors is needed to solve the problem.

Based on the concept evaluation that was carried out, the support vector machine (SVM) method was chosen for this pattern recognition problem. Real vehicle data collected during motorway driving was used to train and test the algorithm. From this data, the algorithm could correctly detect, with 82% accuracy, whether the driver was performing a lane-keep or a lane-change maneuver. The algorithm could also quickly detect a lane-change maneuver before the first wheel crossed the lane marking.

With the sensors available in the test vehicle, no patterns reflecting the driver's intentions could be found in driver-related sensor values prior to a lane change. A new method was therefore proposed that instead determines what the driver ought to want to do in a given situation by looking at data about the surroundings. With the help of this information, the number of misclassifications between lane-change and lane-keep maneuvers could be reduced by 8 percentage points.

Nomenclature

Abbreviations

ABS Anti-lock Braking System

ACC Adaptive Cruise Control

ACT-R Adaptive Control of Thought - Rational

ADAS Advanced Driver Assistance System

AEB Automatic Emergency Brake

CAN Controller Area Network

ESP Electronic Stability Program

GPS Global Positioning System

HMM Hidden Markov Model

LDW Lane Departure Warning

ROC Receiver Operating Characteristics

RVM Relevance Vector Machine

SVM Support Vector Machine

Glossary

False negative When a positive event is classified as negative.

False positive When a negative event is classified as positive.

True negative When a negative event is classified as negative.

True positive When a positive event is classified as positive.

Action The process, or state, of acting or being active.

Intent Having the mind, attention, or will concentrated on something, or at some end or purpose. Intention and intent are synonyms.

Infer To derive something as a conclusion from facts or premises.

Contents

1 Introduction
    1.1 Background
    1.2 Objectives
    1.3 Delimitations
    1.4 Method

2 Theoretical framework
    2.1 Defining driver intent and driver actions
    2.2 Introduction to pattern recognition
        2.2.1 Mathematical description of pattern recognition
        2.2.2 Algorithms for pattern recognition
        2.2.3 Related research for inferring driver intent and actions
    2.3 Introduction to SVM, RVM and HMM
        2.3.1 Support vector machine
        2.3.2 Relevance vector machine
        2.3.3 Hidden Markov models
    2.4 Common scenarios on motorways
    2.5 Sensors
        2.5.1 Sensors used in related research
        2.5.2 Available sensors in the test vehicle
        2.5.3 Conclusions from related research and available sensors
    2.6 Choice of algorithm for the implementation

3 Implementation
    3.1 Acquiring sensor data
        3.1.1 Defining test and training data
    3.2 Analysis of sensor data
    3.3 Enhancement of sensor data
        3.3.1 Current lane position
        3.3.2 Future lane position
        3.3.3 Difference from desired speed
    3.4 System design
        3.4.1 Elements of the system
    3.5 Sensor sets
        3.5.1 Sensor sets for detecting lateral actions
        3.5.2 Sensor sets for predicting lateral intent
        3.5.3 Sensor sets for inferred environmental intent
    3.6 Classifier design
        3.6.1 Introduction to ROC graphs

4 Results
    4.1 Classification of lateral actions
        4.1.1 Time until correct classification
        4.1.2 Approach to reduce the false positive rate
        4.1.3 Impact of including turn signals for detecting lateral actions
    4.2 Classification of lateral intents
    4.3 Classification of inferred environmental intents

5 Discussion
    5.1 Collection of data
    5.2 Classification algorithm
    5.3 Sensors in the test vehicle

6 Conclusions

7 Future work

References

A Support vector machine
    A.1 Linear support vector machine
    A.2 Quadratic programming
        A.2.1 Primal and dual problems in quadratic programming
        A.2.2 Dual problem for SVM
        A.2.3 Soft margin
    A.3 The kernel trick
        A.3.1 Selecting kernel

B Test vehicle

C Flowchart for inferred environmental intent

D Matlab scripts
    D.1 Main script for gridsearch
    D.2 Main script for ROC plot

E Plots: Classification of lateral actions

F Plots: Classification of inferred environmental intent

1 Introduction

This report is the result of a master's thesis project carried out at Scania CV AB in Södertälje and the Department of Machine Design at KTH (the Royal Institute of Technology) in Stockholm. The following introduction provides the background, objectives, delimitations and method for this thesis.

1.1 Background

Heavy vehicles contain many safety systems that aim to mitigate the effects of an accident and to support the driver in performing a vast array of tasks. These include well-known systems such as airbags, Anti-lock Braking Systems (ABS) and the Electronic Stability Program (ESP).

Increasing road safety demands for vehicles in development have resulted in a new generation of safety systems, called Advanced Driver Assistance Systems (ADAS). These new systems aim to prevent an accident by either alerting the driver or actively taking control of the vehicle, making ADAS act more like a co-driver. Typical examples are Blind Spot Detection (BSD), which warns the driver when a vehicle is in the blind spot of the side-view mirror, and Automatic Emergency Brake (AEB), which ensures emergency braking in critical situations.

With the development in this area there is one main problem to solve: how can ADAS accurately decide when to alert the driver or actively take control of the vehicle? ADAS should activate only when needed. They should neither interfere when the driver operates the vehicle correctly nor warn unnecessarily about irrelevant objects or situations. Such misbehavior may lead to annoyance and discomfort for the driver. For example, imagine that the driver intends to change lane. In this scenario it would be annoying if a Lane Departure Warning (LDW) system frequently misinterpreted the situation and sent out a warning for what it believes is an unintended lane-change. Another example is a BSD system that unnecessarily sends out warnings for vehicles in the blind spot when the driver has no intent to change lanes.

This creates the need for a system that can determine, with high accuracy, the intents and actions of drivers in heavy vehicles from sensor measurements. This information would improve the decision-making of ADAS about when to engage in assistive actions, and would also make them more adaptive by letting them adjust their outputs based on both the scenario and the intents and actions of the driver.

1.2 Objectives

Heavy vehicles from Scania carry several sensors that either directly or indirectly measure driver behavior, that is, the driver's intents and actions. The purpose of this thesis is to investigate how these intents and actions manifest in sensor data during and prior to a lane-change on motorways, and how they can be detected using a pattern recognition algorithm. This information about the driver would aid cooperating ADAS in deciding when to engage in assistive actions regarding lane-changes.

In practice, the main objectives are therefore to determine the sensors necessary to infer driver actions and intents (prior to and during lane-changes), and to find a suitable mathematical approach to solve this problem. Sensor data from a Scania truck is used to test the approach and to estimate the accuracies that can be achieved. The selected method, the required sensors and the estimated accuracies are presented in this report.

1.3 Delimitations

For this thesis, only common scenarios (the reasons behind a driver's intent and actions to change lane) that occur on limited-access motorways, including entrance and exit ramps, are handled. The drivers are assumed to be neither distracted nor drowsy when operating the vehicle. They are also assumed to follow existing norms, laws and regulations; for example, reverse driving is excluded since it is not allowed on motorways. The only exception is the use of turn signals, since a study by Olsen et al. showed that turn signals were used in only 66% of all recorded naturalistic lane-changes [1]. The testing also uses only sensor data from systems that are commercially available in Scania trucks today. If necessary, additional sensors will be recommended for future work.

1.4 Method

The workflow for this thesis included six elements: literature study, concept evaluation, concept selection, collecting data, testing and analysis.

The first task was to conduct a literature study of related research to determine the different approaches available for this type of problem. The information gained formed the theoretical framework and determined how the work should progress. A concept evaluation was then made to determine the advantages and disadvantages of the available approaches when applied to the problem. In parallel, common scenarios on motorways and the required sensors were researched. The concept selection then resulted in the approach that was best adapted to the problem, sensors and scenarios.

To obtain vehicle and driver data for the development of the selected algorithm, sensor measurements were recorded in a Scania truck. Field trips were organized for this purpose using five different drivers. The data was recorded using a laptop with the software CANalyzer¹ connected to the built-in CAN network in the truck.

With the selected approach and the recorded data, a test system was built using Matlab². Toolboxes combined with functions written by the author were used to construct and tune the pattern recognition algorithm. This was followed by testing and analysis to determine the accuracy of the solution by running off-line tests with the collected data.

¹ CANalyzer version 7.2.38, Vector Informatik GmbH
² Matlab version 7.12.0.635, The MathWorks, Inc.

2 Theoretical framework

This chapter describes the problem further, followed by the concept evaluation and concept selection carried out to find a suitable algorithm for the implementation.

2.1 Defining driver intent and driver actions

The theory of reasoned action by Fishbein et al. suggests that a person's actions (behavior) are determined by his or her intent to perform those actions [2]. In this theory, intent is viewed as the cognitive representation of the action and is determined by three elements: the attitude toward the specific behavior, subjective norms and perceived behavioral control; see Figure 1.

Figure 1: The theory of reasoned action.

Since it is not possible to observe directly the mental processes described by Fishbein et al., they must instead be inferred at small timescales from actions that result in changed sensor values. The goal is to find specific patterns in these measurements that represent how the driver assesses and prepares for a maneuver (the intent) and how the driver operates the vehicle to realize this intent (the action). This is a continuous process that is updated frequently as new situations and additional information arise. For example, the intent and action to change lane may be followed by a new intent to change lane again given the new circumstances. The intent also often changes several times before a resulting action because of changing scenarios.

Vehicle safety systems that incorporate information about the driver in their decision-making often look only at driver actions, sometimes only at values from a single sensor, when trying to determine whether the system should suppress warnings. These systems aim for high accuracy in their decision-making but often suffer from high rates of false positives (situations where the system warns when it should not). Such high rates sometimes lead the driver to disable the safety system out of annoyance.

An approach that uses multiple sensors both to detect the actual action and to predict it should be able to outperform existing solutions in both accuracy and speed. Such an approach would also be more robust, since it adopts a more holistic view using sensor information from the driver, the vehicle and the environment.

Using multiple sensor inputs introduces complicated dependencies that are difficult to capture in expert-configured rule bases. It is therefore desirable to find another approach to this type of problem. Techniques for this exist within supervised learning in pattern recognition: they use samples of sensor data to train an algorithm that creates the rules automatically. The theoretical framework and the rest of the thesis are built around this type of approach.

2.2 Introduction to pattern recognition

Pattern recognition is a branch of artificial intelligence. It is used for inductive inference, where predictions are based on observations. Or, as Bezdek wrote, "pattern recognition is a search for structure in data" [3]. The importance of pattern recognition has increased over the last few decades. Today, extracting information from large databases is required in most areas; pattern recognition can be used for stock market analysis, machine vision, character recognition and labeling of driver maneuvers in vehicles.

Pattern recognition can be divided into different categories. One division is between syntactic and numerical pattern recognition: the former uses symbolic features, allowing more complex relationships between features, while the latter uses standard numerical feature vectors of fixed dimension. Another division is between generative and discriminative models. Generative models require an estimate of the prior probability $P(a)$ and the conditional probability $P(b|a)$, whereas discriminative models only require an estimate of the conditional probability $P(a|b)$; see Figure 2.

Figure 2: Generative versus discriminative models.
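The two model types are linked by Bayes' rule. As a short worked relation, reading $a$ as the class and $b$ as the observed feature vector (an interpretation, since the text does not name them) and assuming that $a$ ranges over a finite set of classes $a'$:

```latex
P(a \mid b) = \frac{P(b \mid a)\, P(a)}{\sum_{a'} P(b \mid a')\, P(a')}
```

Under this reading, a generative model estimates the two factors in the numerator and derives the left-hand side from them, whereas a discriminative model estimates the left-hand side directly.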

For numerical pattern recognition there are four important elements, as described by Bezdek [4]:

• Process description: the overall description of which model to use, which features to measure and which sensors to use to gather this data.

• Feature analysis: the extraction, preprocessing and selection of the data to use.

• Classifier design: how to classify, estimate, predict and control.

• Classification analysis: tendencies, validation and labeling of data.

These elements are shown schematically in Figure 3. The process description includes selecting how to model reality, which features to measure and which sensors to use for these measurements. The sensor measurements, the inputs, are then analyzed and preprocessed in the feature analysis block. This block ensures that all adequate features are selected. This part is crucial because it helps to lower the amount of data to be processed in later stages. Analysis of the feature vectors also ensures that the correct mathematical approach is chosen for the classifier design. Since no approach is optimal for every set of data, each problem should be treated individually. A feature analysis of Figure 4 could conclude that a non-probabilistic method could be used for Figure 4a because of the crisp separation of the data, linear and shell clustering for Figure 4b, and a fuzzy approach for Figure 4c because of the fuzzy relation between the data [5].

The classifier design aims to tune the parameters of the algorithm for optimal classification. Finally, a classification analysis is performed to validate the results. For successful pattern recognition, these four parts are revisited until the system satisfies the specified requirements.

Figure 3: Schematic view of the pattern recognition process. [6]

(a) Pattern type 1. (b) Pattern type 2 (c) Pattern type 3

Figure 4: Appearance for three different types of data sets.

The classification process can be designed either by using data with known class membership or by learning the classes from similarities in the data. Common to both methods is that the classifier can be used to classify new sets of data only once it is fully designed and tuned. The two methods are called supervised and unsupervised learning. Supervised learning uses data whose target classes are already known, and is referred to as classification in the literature. Unsupervised learning, referred to as clustering, is used when the inputs are of unknown classes and also when the number of classes is unknown.

Pattern recognition aims to find tendencies and structures in a set of data. Pattern matching, by comparison, aims to find exact occurrences of a reference pattern in a given set of data, for example finding the string A in text document B. Because of the stochastic variation in real-world situations and in sensor measurements, strict pattern matching is not applicable to the problem presented in this thesis.

2.2.1 Mathematical description of pattern recognition

Pattern recognition aims to separate a given set of data $X$, consisting of $n$ feature vectors $x_1, \ldots, x_n$ in the real $p$-dimensional space $\mathbb{R}^p$, into a number of classes or clusters. In classical analysis these form a partition of $X$, in which data have a strong degree of membership in one subset and a weak one in the other subsets [7]. A set of data $X$ is classified into $m$ classes (or clusters) $C_1, \ldots, C_m$ such that the following conditions are fulfilled:

$$C_i \neq \emptyset, \quad i = 1, \ldots, m \tag{1}$$

$$\bigcup_{i=1}^{m} C_i = X \tag{2}$$

$$C_i \cap C_j = \emptyset, \quad i \neq j \ \text{and} \ i, j = 1, \ldots, m \tag{3}$$
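Conditions (1) to (3) state that the classes form a partition of the data set. A minimal Python sketch of that check follows, assuming feature vectors are represented as hashable tuples; the helper name `is_partition` and the toy data are illustrative and not taken from the thesis.

```python
from itertools import combinations


def is_partition(X, classes):
    """Return True if `classes` satisfies conditions (1)-(3) for the data set X."""
    X = set(X)
    classes = [set(c) for c in classes]
    non_empty = all(len(c) > 0 for c in classes)       # condition (1)
    covers_x = set().union(*classes) == X              # condition (2)
    disjoint = all(a.isdisjoint(b)                     # condition (3)
                   for a, b in combinations(classes, 2))
    return non_empty and covers_x and disjoint


# Hypothetical two-dimensional feature vectors (p = 2, n = 4).
X = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
print(is_partition(X, [X[:2], X[2:]]))  # two disjoint classes that cover X
print(is_partition(X, [X[:3], X[2:]]))  # X[2] is placed in both classes
```

The second call violates condition (3), since one feature vector belongs to two classes at once.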

Pattern recognition asserts that similar feature vectors are contained in the same class $C_i$, while much less similar feature vectors are contained in other classes. There are different methods for computing this dissimilarity or similarity (also known as proximity measures) between classes. A definition of similarity is equivalent to one of dissimilarity between two feature vectors [8]. With data points placed in topological space, the nearness between two objects is given by the dissimilarity $d(x, y)$ with $x, y \in X$. A common dissimilarity (or metric) is the Euclidean distance,

$$d(x, y) = \sqrt{\sum_{j=1}^{p} (x_j - y_j)^2} = \| x - y \| \tag{4}$$

The Euclidean distance is popular, but the squared Euclidean distance in Equation 5 is also common.

$$d(x, y) = \sum_{j=1}^{p} (x_j - y_j)^2 \tag{5}$$
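As a small illustration of Equations 4 and 5, the two dissimilarities can be computed as follows. This is a sketch in Python; the function names are illustrative and the two feature vectors are made-up values.

```python
import math


def squared_euclidean(x, y):
    """Equation 5: sum of squared coordinate differences."""
    return sum((xj - yj) ** 2 for xj, yj in zip(x, y))


def euclidean(x, y):
    """Equation 4: the Euclidean distance ||x - y||."""
    return math.sqrt(squared_euclidean(x, y))


x, y = (1.0, 2.0), (4.0, 6.0)
print(euclidean(x, y))          # 5.0
print(squared_euclidean(x, y))  # 25.0
```

Both measures rank pairs of feature vectors in the same order, because the square root is monotonic; Equation 5 simply avoids computing it.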

2.2.2 Algorithms for pattern recognition

There are several methods for pattern recognition. Since there is no generic optimal method, the method should be chosen to suit the task. Common techniques are described below and summarized in Table 1 at the end of this subsection.

Classification techniques are characterized by assigning known labels (or targets) to data points, so-called supervised learning. From training samples a decision rule can be constructed that accurately assigns labels to feature vectors in new sets of data; examples are the Support Vector Machine (SVM) and perceptrons. A visualization of a typical linear decision rule between two separated sets of data can be seen in Figure 5. Other common classification approaches are decision trees and the Relevance Vector Machine (RVM).

Figure 5: Visualization of a typical linear decision rule for classification.
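To make the idea of constructing a linear decision rule from training samples concrete, here is a minimal perceptron sketch in Python. It is not the classifier used in the thesis; the function names, the learning rate and the toy data are assumptions chosen for illustration.

```python
def train_perceptron(samples, labels, epochs=100, lr=1.0):
    """Learn w and b so that sign(w.x + b) reproduces the labels (+1 or -1)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified or on the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                errors += 1
        if errors == 0:  # every training sample is classified correctly
            break
    return w, b


def predict(w, b, x):
    """Apply the linear decision rule to a new feature vector."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Hypothetical, linearly separable training data with two features.
samples = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (4.0, 4.0), (5.0, 4.0), (4.0, 5.0)]
labels = [-1, -1, -1, 1, 1, 1]
w, b = train_perceptron(samples, labels)
print(w, b, predict(w, b, (6.0, 6.0)))
```

The perceptron stops at the first separating line it finds, whereas the SVM described in Section 2.3.1 looks for the separating hyperplane with the largest margin.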

Clustering techniques are used for sets of data with unknown labels and an unknown number of labels, so-called unsupervised learning. For example, in Figure 6 a set of data is separated into three clusters since the data is concentrated in these three areas. Some clustering techniques search for local extremes, using optimization to reduce a cost function. The local extremes then define optimal clusterings [8]. Hierarchical methods create a hierarchy of clusters that are either merged (agglomerative) or divided (divisive) at each iteration. Another method is genetic clustering, which involves creating initial populations of clusters that are allowed to evolve and improve.

Figure 6: Visualization of how clusters are assigned to data.
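One widely known instance of clustering by iteratively reducing a cost function is k-means (Lloyd's algorithm). The thesis does not name it, so the Python sketch below is only an illustrative stand-in. It assumes that the number of clusters and the initial centroids are supplied by the caller, and the data points are made up.

```python
def kmeans(points, centroids, iterations=20):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sum((pj - cj) ** 2 for pj, cj in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        new_centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:  # assignments no longer change
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical data concentrated in three areas, as in Figure 6.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
          (0.0, 10.0), (1.0, 10.0), (0.0, 11.0)]
centroids, clusters = kmeans(points, [points[0], points[3], points[6]])
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

Each iteration cannot increase the sum of squared distances between points and their centroids, so the procedure settles in a local minimum of that cost, which depends on the initial centroids.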

Regression algorithms are used when real-valued outputs are required, in contrast to the discrete outputs of classification algorithms. They are mainly used for prediction (forecasting), where a predictive model is fitted to observed data; see the gray line in Figure 7. With this model, predicted $x_2$ values can be determined for additional values of $x_1$. Regression is also used to find the strength of the relationship between dependent and independent data. There are linear regression approaches, see Figure 7, and others based on, for example, SVM and perceptrons.

Figure 7: Visualization of linear regression for a set of data.
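A linear regression of the kind shown in Figure 7 can be sketched with ordinary least squares. The Python example below is an illustration under that assumption; the function name and the observations are made up and do not come from the thesis data.

```python
def fit_line(x1, x2):
    """Least-squares fit of x2 = slope * x1 + intercept."""
    n = len(x1)
    mean_x1 = sum(x1) / n
    mean_x2 = sum(x2) / n
    sxy = sum((a - mean_x1) * (b - mean_x2) for a, b in zip(x1, x2))
    sxx = sum((a - mean_x1) ** 2 for a in x1)
    slope = sxy / sxx
    intercept = mean_x2 - slope * mean_x1
    return slope, intercept


# Hypothetical observations scattered around a straight line.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = [1.1, 2.9, 5.2, 6.8, 9.0]
slope, intercept = fit_line(x1, x2)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")

# Predict x2 for a new value of x1 using the fitted model.
print(slope * 5.0 + intercept)
```

The last line corresponds to reading a predicted $x_2$ value off the fitted line for an $x_1$ value that was not observed.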

Sequence labeling is often used in combination with classification or regression. This approach assigns a categorical label or real-valued output to each member of a sequence of observed values. One example is speech recognition, where the aim is to assign a label to each word in a sentence. Accuracy is achieved by making labels dependent on nearby labels in the sequence and finding the best set of labels. These methods are mostly based on probabilistic approaches, such as Markov models. Similar sequential techniques that aim to find real values are Kalman and particle filters.

Table 1: Common algorithms for pattern recognition.

Classification             Clustering                 Regression               Sequence labeling
Support vector machine     Fuzzy clustering           Linear regression        Hidden Markov models
Decision trees             Hierarchical clustering    Perceptrons              Kalman filters
Perceptrons                Agglomerative clustering   Support vector machine   Particle filters
Relevance vector machine   Genetic clustering

2.2.3 Related research for inferring driver intent and actions

Early research in this area focused on motion-based approaches, with kinematic or dynamic models describing a vehicle and predicting its state evolution (speed, position, heading). As research continued, the importance of contextual information gained notice, resulting in the behavior-based techniques most commonly used today. Behavior-based techniques rely on the fact that drivers and vehicles exhibit typical patterns in specific maneuvers. To capture these patterns, more sensors are used to gather data from the driver, vehicle and environment, yielding a more holistic solution [9] [10] [11].

One of the first approaches, by Liu and Pentland, used Hidden Markov Models (HMM) to classify patterns in driver actions. This method modeled human behavior with a set of dynamic models sequenced together with a Markov network of probabilistic transitions [12] [13]. The first versions developed did not have the continuous recognition that would be required in a real vehicle, but they were able to distinguish 7 maneuvers using coupled HMM [11] [12] [13]. Kuge et al. extended this work with continuous recognition [14]. Berndt et al. also used HMM with continuous recognition, for prediction of driver maneuvers [15].

In recent work, the binary classification approaches SVM and RVM have gained popularity over the previously more widely used HMM. Mandalia et al. used SVM to create a continuous classifier for lane-keep and lane-change actions [16] [17]. Mabuchi et al. used SVM to classify whether a driver decides to stop at a yellow light at traffic intersections [18]. Morris et al. used a discriminative real-time RVM classifier to detect the intent to change lanes [19]. McCall et al. also used RVM to predict driver intents to change lanes using real-world data [20].

Salvucci used a "mind-tracking architecture" for simulating driver intents using a cognitive model of driver behavior implemented in the architecture for Adaptive Control of Thought - Rational (ACT-R). This system compared simulations with off-line recordings of the driver's actual behavior to infer unobservable intents [9]. An approach by Lefevre et al. used an occupancy grid and a Bayesian occupancy filter for estimating and predicting vehicle positions [10]. Henning et al. used a "knowledge discovery framework" to study behavioral and environmental indicators to predict lane-changes [21]. A brake assistance system aware of driver behavior, by Trivedi et al., used a Bayesian classification framework [22].

2.3 Introduction to SVM, RVM and HMM

By combining the problem presented in this thesis with the methods commonly used for this type of problem in related research, it was decided to continue with a study of SVM, RVM and HMM for the concept evaluation. These methods have proved to work well for finding patterns in sensor data. Some of the results obtained in related research are presented in Section 2.6.

Each of these three techniques is briefly described in the following sections. Their advantages and disadvantages are evaluated, together with information about available sensors and common scenarios on motorways, in the discussion of the choice of algorithm for the implementation in Section 2.6.

2.3.1 Support vector machine

Support vector machine is a supervised learning method for analyzing and recognizing patterns in data, for both classification and regression. SVM emerged in the mid-1990s from the area of statistical learning theory developed by Vapnik in the late 1970s. It can be viewed as a new way of training multilayer perceptrons (used in neural nets) by solving a quadratic problem [23]. A quadratic problem is an optimization problem of minimizing or maximizing a quadratic function subject to linear constraints on its variables. SVMs are today widely used in many areas, such as handwritten digit recognition [24] and object recognition [25].

SVM finds a hyperplane, or a set of hyperplanes, of dimension $p-1$ in the $p$-dimensional space $\mathbb{R}^p$ that separates two classes of interest. These hyperplanes may also be called decision surfaces, optimal separating hyperplanes or optimal margin hyperplanes [23]. A good classification is achieved when the functional margin to the nearest data points is large, because a larger margin lowers the generalization error of the classifier; see Figure 8.

Figure 8: A hyperplane separating two sets of data (dots and boxes).

Many problems are not linearly separable in the original finite-dimensional space. Therefore a kernel function, K(x, x_i), is used to transform the original feature space to a higher-dimensional space in which a separating linear hyperplane can be found. This yields the general solution,

$$f(x) = \sum_{i=1}^{k} c_i K(x, x_i) \qquad (6)$$

with x_i being the inputs (feature vectors), K the kernel and c_i a set of parameters determined from the inputs [16].
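As an illustration of Equation 6, the following minimal sketch evaluates the kernel expansion for a single feature vector. The Gaussian (RBF) kernel, the value of `gamma`, the stored inputs and the coefficients are all assumptions chosen for the example; they are not taken from the thesis.

```python
import math

def rbf_kernel(x, xi, gamma=0.5):
    """Gaussian (RBF) kernel K(x, xi) = exp(-gamma * ||x - xi||^2).

    The kernel type and gamma are illustrative assumptions.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-gamma * sq_dist)

def decision_function(x, inputs, coeffs, kernel=rbf_kernel):
    """Evaluate f(x) = sum_i c_i * K(x, x_i), the form of Equation 6."""
    return sum(c * kernel(x, xi) for c, xi in zip(coeffs, inputs))

# Hypothetical stored inputs x_i and coefficients c_i (made-up values).
inputs = [(0.0, 0.0), (1.0, 1.0)]
coeffs = [1.0, -1.0]

# The sign of f(x) indicates on which side of the decision surface x lies.
value = decision_function((0.0, 0.0), inputs, coeffs)
```

A point that is equally far from both stored inputs receives f(x) = 0 here, since the two coefficients cancel.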

Mathematically, a linear SVM can be described using a set of training data (x_i, y_i) for i = 1, ..., n with y_i ∈ {−1, 1} and x_i ∈ R^p, where y_i indicates the class that the feature vector x_i belongs to. In other words, y_i shows on which side of the decision boundary the data point should reside. For a given set of data, there can be several hyperplanes that separate the two classes, see Figure 9. The goal is to find the hyperplane that maximizes the margin between the two classes. The margin is defined as the sum of the distances from the closest data points of both classes to the hyperplane. A larger margin is desired because it reduces the problem of overfitting. Overfitting occurs when the solution is too closely fitted to the training data and does not generalize to unknown test data.


Figure 9: Several hyperplanes separating two sets of data.

An arbitrary hyperplane that completely separates the data points into two classes is defined by the equation

$$f(x) \equiv w \cdot x + b = 0 \qquad (7)$$

with w being the normal to the hyperplane, x a feature vector and b an offset that fixes the position of the plane; the perpendicular distance from the origin to the plane is |b|/||w||. Given an initial hyperplane with the target classes y_i ∈ {1, −1}, the parameters w and b can be scaled to satisfy the following constraints,

$$x_i \cdot w + b \geq +1 \quad \text{when } y_i = +1 \qquad (8)$$

$$x_i \cdot w + b \leq -1 \quad \text{when } y_i = -1 \qquad (9)$$

In Figure 8 these constraints represent the two parallel bounding hyperplanes (dotted lines). All data points that lie on these hyperplanes are the support vectors, hence the name of the method [26]. The perpendicular distance between the two bounding hyperplanes is 2/||w||. Equation 8 and Equation 9 can be combined into one inequality,

$$y_i (x_i \cdot w + b) - 1 \geq 0 \quad \forall i \qquad (10)$$
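The constraints in Equations 8 to 10 and the margin width 2/||w|| can be checked numerically. The sketch below does this for a hand-picked hyperplane and a small two-dimensional data set; the data points, w and b are hypothetical values chosen for illustration and are not the result of an actual SVM training.

```python
import math

def functional_margins(points, labels, w, b):
    """Return y_i * (x_i . w + b) for every training point."""
    return [
        y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
        for x, y in zip(points, labels)
    ]

def satisfies_constraints(points, labels, w, b, tol=1e-9):
    """Check the combined inequality y_i * (x_i . w + b) - 1 >= 0 for all i."""
    return all(m - 1.0 >= -tol for m in functional_margins(points, labels, w, b))

def margin_width(w):
    """Perpendicular distance 2 / ||w|| between the two bounding hyperplanes."""
    return 2.0 / math.sqrt(sum(wj * wj for wj in w))

# Hypothetical, linearly separable data (made-up values).
points = [(2.0, 0.0), (3.0, 1.0), (0.0, 0.0), (-1.0, 2.0)]
labels = [1, 1, -1, -1]
w, b = (1.0, 0.0), -1.0

# Points whose functional margin equals 1 lie on a bounding hyperplane,
# i.e. they are the support vectors for this choice of w and b.
support_vectors = [
    x for x, m in zip(points, functional_margins(points, labels, w, b))
    if abs(m - 1.0) < 1e-9
]
```

With these values the constraints hold, the margin width is 2 and the support vectors are (2, 0) and (0, 0).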

Research within this area has investigated different approaches for multi-class problems. Common approaches are to solve multiple one-versus-one or one-versus-rest problems. Methods such as directed acyclic graphs can combine several binary classifiers to determine one resulting class label. Other approaches combine many binary objective functions and then solve the resulting optimization problem. A comparative work found that for multi-class problems the one-versus-one method is optimal in terms of accuracy and computational cost [23].
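A minimal sketch of the one-versus-one idea is given below, assuming simple majority voting over all pairwise binary classifiers. The class names and the threshold-based stand-in classifiers are hypothetical; in practice each pairwise classifier would be a trained binary SVM.

```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(x, classes, binary_classifiers):
    """Combine pairwise binary classifiers by majority vote.

    binary_classifiers maps a class pair (a, b) to a function that
    returns either a or b for the input x.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[binary_classifiers[(a, b)](x)] += 1
    # Ties are resolved in favor of the class that received a vote first.
    return votes.most_common(1)[0][0]

# Hypothetical classes and stand-in classifiers on a scalar feature.
classes = ["left", "keep", "right"]
binary_classifiers = {
    ("left", "keep"): lambda x: "left" if x < -1.0 else "keep",
    ("left", "right"): lambda x: "left" if x < 0.0 else "right",
    ("keep", "right"): lambda x: "right" if x > 1.0 else "keep",
}

prediction = one_vs_one_predict(0.2, classes, binary_classifiers)
```

For k classes this scheme needs k(k − 1)/2 binary classifiers, three in the example above.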

2.3.2 Relevance vector machine

SVM has become a well-known method for pattern recognition and machine learning that makes predictions based on training data. However, SVM is limited because it makes point predictions rather than predictive distributions. RVM was formulated to address this problem by adding class membership probabilities to its predictions [27].

RVM has been used for several different pattern recognition tasks, e.g. image analysis and facial expression recognition. The technique is based on a Bayesian formulation (degrees of plausibility given the information) using a prior (initial degree of belief) that results in a sparse representation, i.e. a representation that describes the features in a compact way, similar to how the Fast Fourier Transform decomposes a signal into its building blocks.


The method is similar to SVM and can be viewed as a Bayesian extension of it, useful for both regression and classification. Like SVM, RVM is used for binary classification, here with the aim of predicting the posterior probability of class membership for a given input feature vector x. The input may then be assigned the label of the class to which it most likely belongs [28].

The training of an RVM classifier consists of two steps. The first step is to collect training data with given targets (x_i, y_i), where y_i ∈ {−1, 1} and x_i ∈ R^p for i = 1, ..., n. The second step is to find optimal parameters for the classifier given the training data. The process is as follows: for a given input vector x, RVM models the probability distribution of its label y using a logistic function. This function is useful since it maps inputs ranging from negative infinity to positive infinity to values between 0 and 1,

$$\sigma(f_{RVM}(x_i)) = \frac{1}{1 + e^{-f_{RVM}(x_i)}} \qquad (11)$$

In Equation 11 the classifier function fRVM (x) is given by,

$$f_{RVM}(x) = \sum_{i=1}^{N} w_i K(x, x_i) \qquad (12)$$

with K(x, x_i) being a kernel function and w_i adjustable weights. In contrast to SVM, the kernel is not required to satisfy Mercer's condition, which requires that the kernel is continuous, symmetric and has a positive semi-definite Gram matrix. The Gram matrix is the matrix of all possible inner products of a set of vectors [27].

Based on the discrete Bernoulli distribution (which has only two possible outcomes), the likelihood of the labels y given the adjustable weights w is expressed according to Equation 13.

$$P(y|w) = \prod_{i=1}^{N} \sigma(f_{RVM}(x_i))^{y_i} \left(1 - \sigma(f_{RVM}(x_i))\right)^{1 - y_i} \qquad (13)$$
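Equations 11 to 13 can be combined into a short numerical sketch that evaluates the logarithm of the likelihood. Note the assumption that the targets are recoded to y_i ∈ {0, 1}, which the exponents y_i and 1 − y_i in Equation 13 require; the linear kernel, the training inputs and the weights are likewise hypothetical values chosen for illustration.

```python
import math

def sigmoid(z):
    """Logistic function of Equation 11, written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def linear_kernel(x, xi):
    """Inner product kernel, used here only as a simple example."""
    return sum(a * b for a, b in zip(x, xi))

def f_rvm(x, train_inputs, weights, kernel):
    """Classifier function of Equation 12: sum_i w_i * K(x, x_i)."""
    return sum(w * kernel(x, xi) for w, xi in zip(weights, train_inputs))

def log_likelihood(train_inputs, targets, weights, kernel):
    """Logarithm of Equation 13, assuming targets recoded to 0 or 1."""
    total = 0.0
    for x, y in zip(train_inputs, targets):
        p = sigmoid(f_rvm(x, train_inputs, weights, kernel))
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

# Hypothetical training data and weights (made-up values).
train_inputs = [(1.0,), (-1.0,)]
targets = [1, 0]
weights = [1.0, -1.0]

ll = log_likelihood(train_inputs, targets, weights, linear_kernel)
```

Working with the log-likelihood turns the product in Equation 13 into a sum, which is the form that appears in the first term of Equation 14.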

The marginal likelihood P(y|α), where α denotes the parameters of the prior on the weights w, cannot be obtained analytically, which results in the need for an iterative process. With α*_i being the maximum a posteriori (MAP) estimate of the parameter α_i, it is possible to estimate the MAP for the weights, w_MAP. This is obtained by maximizing the posterior distribution of the class labels given the input vectors, which is equivalent to maximizing the objective function J in Equation 14.

$$J(w_1, w_2, \ldots, w_N) = \sum_{i=1}^{n} \log P(y_i | w_i) + \sum_{i=1}^{n} \log P(w_i | \alpha_i^*) \qquad (14)$$

In this equation the first summation term corresponds to the likelihood of the class labels and the second term to the prior on the parameters w_i. In the resulting solution only the training data points with nonzero weights w_i (called relevance vectors) contribute to the decision function [23] [27].

2.3.3 Hidden Markov models

A Hidden Markov Model (HMM) is a statistical tool for modeling generative sequences, developed in the 1960s by Baum et al. The approach is named after the Russian mathematician Andrei Markov. The technique has been applied to several tasks including signal processing, speech recognition and extraction of information from documents [29].

Figure 10 shows an example of a Markov process with three states, Urn 1, Urn 2 and Urn 3, and three observations, red, green and blue. The model is a finite state machine with given probabilistic transitions between states. Given the sequence of observations red, red, green and blue, it can be verified that the states Urn 2, Urn 2, Urn 1 and Urn 3 generated this sequence. The probability of this sequence would then be 0.3 · 0.3 · 0.5 · 0.2 = 0.009.

Figure 10: Example of Markov process with three states and three outputs red, blue and green.

To extend this model to an HMM, the new model allows all observations to be emitted at each state with a specific probability, see Figure 11. The main difference is that it is no longer possible to know exactly which sequence of states generated a specific sequence of observations. The state sequence is hidden, hence the name of the method. What can be done is to calculate the probability that the model generates the sequence of observations, and the most likely sequence of states to produce them. An HMM is therefore characterized by the individual bias of the internal states (their emission probabilities), the transition probabilities between states and the set of initial probabilities [30].

Figure 11: Example of hidden Markov model.


The compact mathematical notation for HMM is λ = (A,B, π) with the following elements:

A: The state transition probabilities of state j following state i.

B: The emission probability of observing vk when at state j.

π: The probability of being in state q at time t = 0 (initial state).

O: The observation sequence of fixed length T, o1, ..., oT , at time t.

Q: The state sequence of fixed length T, q1, ..., qT , at time t.

The HMM includes two assumptions. The first one, called the Markov assumption, states that the Markov property holds. The Markov property is a property of a state signal that compactly contains all relevant past information [31]. One example is a golf ball in flight: all that matters for its future states is the current position and velocity, independently of how they were reached. The next state therefore depends only on the current state, P(q_t | q_{t−1}). The second assumption states that the output (observation) at time t depends only on the current state, i.e. it is independent of previous observations, P(o_t | q_t).

When implementing a HMM method there are three main problems to solve.

1. Given the model λ = (A, B, π) and the observation sequence O = o_1, ..., o_T, how is the probability of the sequence given the model, P(O|λ), computed efficiently?

2. Given the model λ = (A, B, π) and the observation sequence O = o_1, ..., o_T, how is the optimal state sequence Q = q_1, ..., q_T computed to explain the observations?

3. How should the parameters of λ = (A,B, π) be tuned so that P (O|λ) is maximized?

The first two problems can be solved analytically, while the third problem is solved by training the model.
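For the first problem, a common choice in the HMM literature is the forward algorithm, which computes P(O|λ) recursively instead of summing over every possible state sequence. The thesis does not state which procedure it has in mind, so the sketch below is only an illustration; the two-state model and all its probabilities are hypothetical.

```python
def forward_probability(A, B, pi, observations):
    """Compute P(O | lambda) with the forward algorithm.

    A[i][j]  : probability of state j following state i
    B[j][k]  : probability of observing symbol k when in state j
    pi[i]    : probability of starting in state i
    observations : sequence of observation symbol indices
    """
    n_states = len(pi)
    # Initialization: joint probability of the first observation and each state.
    alpha = [pi[i] * B[i][observations[0]] for i in range(n_states)]
    # Recursion: propagate through the transitions, then weight by the emission.
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][obs]
            for j in range(n_states)
        ]
    # Termination: sum over all states at the final time step.
    return sum(alpha)

# Hypothetical model with two states and two observation symbols.
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.5, 0.5]

p_single = forward_probability(A, B, pi, [0])
```

The cost grows with the square of the number of states and linearly with the sequence length T, whereas enumerating all state sequences grows exponentially with T.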

2.4 Common scenarios on motorways

The limitation (see Section 1.3) to only look at scenarios occurring on limited-access motorways, including entrance and exit ramps, has many benefits. It limits the number of possible scenarios, and the scenarios are fairly easy to recreate when collecting measurement data. Another advantage is that only motor vehicles with allowed speeds of 40 km/h or higher are permitted, thereby excluding pedestrians, mopeds, bicycles and agricultural vehicles. There are also no intersections or traffic lights, which further decreases the number of scenarios. As mentioned earlier, drivers in the scenarios described below are assumed to follow existing laws and norms, with the exception of the usage of turn signals.

Limited-access motorways have opposing lanes of traffic separated by a median strip. This median strip can be a traffic barrier, boulders or simply a strip of grass. In Europe the rules for these motorways are more or less the same, with speed limits being the main difference. Maximum speed limits often vary between 100 km/h and 130 km/h, with the exception of Germany, which in general has no speed limit. Since the research and data collection for this thesis were carried out in Sweden, all results are discussed in reference to right-hand traffic. Since the thesis also involves pattern recognition of driver intents and actions for heavy vehicles with a connected trailer, it should be noted that the maximum allowed speed for these vehicles is 80 km/h according to Swedish law [32].

The most common situations can be summarized and simplified to the 14 scenarios in Figure 12. There exist variations of these; for example, in Figure 12a the vehicle could reside in the middle or outer lane, but this contradicts existing laws and regulations. For illustration, these scenarios are represented using a motorway with four lanes, but the scenarios are applicable to a motorway of arbitrary size.

Figure 12: Common scenarios occurring on a motorway, shown as sub-figures (a) to (n) for Scenario 1 to Scenario 14. Each sub-figure is explained below.


Explanations for each subfigure in Figure 12:

1. Ideal situation, the ego-vehicle driving in the inner lane with desired speed (80 km/h) and no surrounding traffic.

2. Situation with a leading vehicle in front of the ego-vehicle and no other surrounding traffic. Depending on the leading vehicle's relative speed the ego-driver will intend to either stay in the current lane at desired speed or change lane to obtain desired speed.

3. Obstacle in front of the ego-vehicle and no surrounding traffic. The driver intends to change lane to maintain desired speed.

4. A leading vehicle in front of the ego-vehicle and surrounding traffic. Depending on the relative speed of the leading and surrounding vehicles, the driver will intend to either stay in the current lane (at optimal speed or lowered speed) or change lane to obtain optimal speed.

5. Obstacle in front of the ego-vehicle. Depending on the amount of traffic, the driver will intend to either change lane to avoid the obstacle and maintain desired speed, or decelerate to a full stop in the current lane.

6. Driver intends to change lane and enter motorway.

7. No surrounding traffic and the driver intends to change lane to aid a vehicle on the entrance-ramp.

8. Driver intends to stay in the current lane due to surrounding traffic and cannot aid the vehicle on the entrance-ramp.

9. Driver intends to stay in current lane since there is no intent to exit the motorway.

10. Driver intends to change lane to enter exit-ramp and leave motorway.

11. Driver intends to change lane to an inner lane for slower vehicles. This could be due to additional lanes on the motorway, a previous lane-change for passing, preparation for an upcoming exit-ramp, or vehicles with a higher speed in the current lane.

12. Traffic congestion, driver intends to stay in the current lane. Alternatively the lanes have different end paths.

13. Entrance ramp becomes a new lane on the motorway. Driver aims to change lane.

14. Current lane turns into an exit lane, driver aims to change lane.

Conclusions from these common scenarios show that it is important to determine the current scenario when classifying driver intents and actions, because it is the scenario that determines the intent, which in turn determines the actions. These scenarios also verify that during motorway driving there is a limited number of maneuvers to predict and detect. There are two lateral maneuvers (as handled by this thesis), lane-keep and lane-change, and two longitudinal maneuvers, keep velocity and change velocity.

With the assumption that drivers will always try to obtain the maximum allowed speed, the scenarios in Figure 12 can be described compactly using a single flowchart, see Appendix C. This flowchart shows the decisions the driver has to make, but also the sensor data an algorithm needs to estimate which maneuver the driver will most likely perform in the near future given the scenario.


2.5 Sensors

To classify a driver's lateral intents and actions, sensors are needed to quantify behavior and scenarios into digital values. These digital values correspond to the feature values used for pattern recognition, as explained earlier. It is important to only use sensors that measure significant changes and to avoid those that do not show enough predictability or that vary randomly. This requires that the sensor output changes adequately with changes in the measured object and that the sensor has limited effect on the actual measurements. For the purpose of obtaining vehicle and driver data there are in general no constraints on the source. The data could come from ego-vehicle sensors, external stationary sensors or V2X communication (e.g. vehicle to vehicle).

Sensors used for this type of problem can be separated into three categories: driver behavior, vehicle dynamics and environmental data. Sensors for driver behavior show how the driver assesses the current situation and prepares for upcoming lateral actions with head, eye and feet movements. They also show how the driver operates the vehicle by changing inputs such as the steering wheel and accelerator pedal. Sensors for vehicle dynamics give information about the accelerations, velocities and yaw rates of the vehicle. Sensors for environmental data give information about surrounding traffic and the position of the ego-vehicle in relation to reference systems (e.g. lane markings or position using GPS).

The two following subsections first summarize the sensors used in related research and then the sensors available in the truck that will be used for testing. A combination of these will later be used to select appropriate sensors for the implementation phase. These subsections are followed by a discussion of how the sensors available in the test vehicle will, in theory, affect the results for predicting driver intent and driver actions.

2.5.1 Sensors used in related research

For Salvucci's mind-tracking technique for predicting lane-changes, steering wheel angle, accelerator pedal position, future lane position (calculated from current heading) and time to leading vehicle were used [9]. Pentland and Liu used longitudinal velocity, longitudinal acceleration, current gear and steering wheel angle for their HMM approach to detect lane changes [12]. McCall et al. used accelerator pedal position, brake pedal position, longitudinal acceleration, longitudinal velocity, steering wheel angle, yaw rate, lateral acceleration, road curvature, heading, lateral lane position, and side-to-side and up-down head movements, using RVM to detect intents to change lanes [20]. Doshi et al. tried to predict lane changes with RVM by using steering wheel angle, yaw rate, turn signals (and length of time activated), distance to leading vehicle, relative speed of leading vehicle, gap in time to leading vehicle, difference between current and desired speed, lane position and road curvature. They also used eye- and head-tracking to measure head and eye movements [19]. The SVM approach by Salvucci and Mandalia used only seven features for their lane-change prediction: lane positions at 0, 10, 20 and 30 meters ahead (calculated from the current angle towards the lane markings), the angle towards the lane markings, and lateral and longitudinal acceleration [16]. Berndt et al. used accelerator pedal position, brake pedal position, longitudinal acceleration, steering wheel angle, steering wheel velocity, yaw rate, curvature and type of road [15]. Trivedi et al. used steering wheel angle, wheel speed, longitudinal acceleration, lateral acceleration, yaw rate, brake pedal pressure, accelerator pedal position, head movements, foot movements and distance to leading vehicle to detect intent to brake [22]. Mabuchi et al. used distance from intersection, velocity and acceleration to detect intent to stop at an intersection with yellow light [18].

Map data has mostly been applied to stationary safety systems at intersections, where maps have been used as constraints on reachable points. However, recent work has shown that the information in these road network maps is an important indicator for predicting vehicle behaviors [10]. To meet this increasing demand for map data the ADAS Horizon project was initiated. This project uses the ADASIS protocol for standardizing the communication of map data to ADAS using CAN in vehicles [33].

The most recent studies have particularly shown the importance of measuring eye (gaze) and head movements for predicting driver intents. Results by Henning et al. showed that a glance to the left mirror happened in 99% of all lane changes to the left, and in 87% of the cases a glance to the left mirror preceded a turn signal [21]. Related research by Tijerina et al. found that the probability that a driver glances at the mirror prior to a lane change was between 65% and 92%, and Olsen et al. concluded that these visual mirror searches last on average 1.1 seconds [1] [34]. Research by Boer et al. showed that approximately three seconds prior to a lane-change, drivers exhibited a significantly different context scanning behavior, with more frequent gazes to the rear view mirror at the expense of views of the current lane. When the lane-change is initiated the gaze shifts from guiding features in the current lane to the destination lane. The drivers also increased their gaze at surrounding vehicles (front and back) before and during the lane change for situation awareness [35]. Studies by Doshi et al. also showed that head motions were significant 4 to 1.5 seconds before a lane-change, while lane deviation is more important 2 to 0 seconds prior to the actual maneuver. Their resulting classifier was able to predict lane-changes up to 3 seconds prior to the actual maneuver due to these head movements [19].

Another, but less convenient, approach for measuring intent could be to use electroencephalography (EEG) to measure electric responses in the brain. Studies by Haufe et al. showed that a driver assistance system using EEG detects the intent for emergency braking 130 ms earlier than systems relying only on pedal responses. At a driving speed of 100 km/h this would shorten the braking distance by 3.66 meters [36]. Whether this type of sensor is implementable in commercial vehicles is another question.

In Table 2 the sensors used in related research are summarized and separated into the threecategories; driver, vehicle dynamics and environmental data. The longitudinal, lateral and yawaxis are defined in Figure 13.

Figure 13: Definition of longitudinal, lateral and yaw axis.


Table 2: Sensor data used for related research.

Category             Sensor type
Vehicle dynamics     Longitudinal and lateral velocity
                     Longitudinal and lateral acceleration
                     Yaw rate
Driver behavior      Eye movements
                     Head movements
                     Feet movements
                     Steering wheel angle
                     Steering wheel velocity
                     Accelerator pedal position
                     Brake pedal position
                     Turn signals
                     Gear
                     EEG
Environmental data   Lateral lane position (current and future)
                     Distance to leading vehicle
                     Distance to following vehicles (current and adjacent lanes)
                     Relative velocity of leading vehicle
                     Relative velocity of following vehicle
                     Road curvature
                     Map data

2.5.2 Available sensors in the test vehicle

A majority of the sensors in the categories vehicle dynamics and environmental data used in related research also exist in the truck that will be used for testing. There is a distance radar that tracks distances and velocities of leading vehicles in both the current and adjacent lanes (part of the ACC system) and an LDW camera that detects distances and angles towards the lane markings. The only exception is that there are no sensors to track vehicles behind and to the side of the truck.

Looking at sensors for driver behavior, there are unfortunately no trackers for eye, head and feet movements in the test vehicle. It should be noted that these trackers were only available in related research because specially equipped vehicles were used for the testing; commercially available vehicles, which are the objective of this thesis, were not used. As expected, EEG technologies are not available either. All other signals are present, including retarder lever position and reference speed for the cruise control system. In Table 3 the available systems and sensor data at Scania are summarized.


Table 3: Sensor data available in the test vehicle.

Category             Sensor type
Vehicle dynamics     Longitudinal and lateral velocity
                     Longitudinal and lateral acceleration
                     Yaw rate
Driver behavior      Steering wheel angle
                     Accelerator pedal position
                     Brake pedal position
                     Turn signals
                     Cruise control reference speed
                     Retarder position
Environmental data   Distance to right and left lane marking
                     Angle to right and left lane marking
                     Distance to leading vehicle
                     Relative speed of leading vehicle
                     Distance to leading vehicles in adjacent lanes
                     Relative speed of vehicles in adjacent lanes
                     Curvature

2.5.3 Conclusions from related research and available sensors

After studying related research it is clear that the terms intent and predict have not been used consistently. Most of the studies have tried to detect the initial stages of the lane-change maneuver (the action), whereas only a few actually aimed to predict the lane-change (the intent) by looking at the patterns in driver behavior prior to the lane-change action.

The objective of this thesis is to investigate how a driver's lateral intents and actions manifest in sensor data. In other words, to construct a system that can detect driver-related patterns for lane-keep and lane-change actions and predict lane-changes by detecting preparatory driver actions prior to the lane-change (the intent).

Results from related research have shown that with the current set-up of sensors in the test vehicle there are good possibilities to classify between lane-keep and lane-change actions. This task requires sensor data from all three sensor categories: environmental data, vehicle dynamics and driver behavior.

Predicting a lane-change by looking at the driver's preparatory actions prior to the actual lane change will be a more cumbersome task. For these actions, referred to as the lateral intent, significant patterns have only been found when using eye- or head-trackers. No significant patterns were found in how the driver operates the vehicle prior to a lane-change, for example by positioning the vehicle closer to the lane marking. Therefore, for this thesis and its testing it will be required to infer the lateral intent using other cues. This implies that there will be less focus on the driver and more focus on the surroundings and the current scenario. This introduces a natural uncertainty, since decisions are based on what the driver should prefer to do given the circumstances rather than on more direct cues of what the driver intends. This third approach will be referred to as inferred environmental intent.

The three-part approach is visualized in Figure 14. In this approach and for the implementation there will initially be three separate tasks. Beginning from the end, the first task is to classify between lane-change and lane-keep actions. Preceding the lateral actions is the task to detect the lateral intent, in other words to predict the lane-change. Despite the absence of patterns found in related research without eye- and head-trackers, it is still worth verifying those results in this work. There could be differences since heavy vehicles are used in this thesis compared to the standard vehicles in previous research. The final task is then to determine the possibilities and accuracies when predicting actions using only environmental data, the inferred environmental intent.

Figure 14: Dependency between inferred environmental intent, lateral intent and lateral actions.

The suggested approach might also be useful for detecting drowsy or distracted drivers: if the inferred environmental intent contradicts the detected lateral actions, it may imply distracted or drowsy actions. Secondly, by incorporating driver inputs such as steering wheel angle into the solution it may also be possible to exclude lane-changes without the appropriate steering wheel patterns.

2.6 Choice of algorithm for the implementation

Results from related research have proven the strength of binary classification algorithms for the tasks of detecting and predicting driver actions with adequate sensors.

The SVM algorithm has worked very well, offering a robust and efficient approach for both lateral and longitudinal studies. For detecting lateral actions using sensors available in the test vehicle, true positive rates of up to 97.9% have been achieved at a 5% false positive rate using real vehicle data. The approach also proved to be quick, with a true positive rate of 87% within 0.3 seconds from the start of the maneuver. In a comparative work an HMM approach only obtained true positive rates of up to 80% at a 5% false positive rate [17]. Other work using HMM with real-world data has achieved 71% and 74% accuracy for detecting left and right lane changes, also using sensors available in the test vehicle [15]. An RVM approach obtained 80% accuracy but had problems with false positives [19]. This approach also used sensor information not available in the test truck.
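The true positive and false positive rates quoted above can be computed from a list of true labels and predicted labels as in the sketch below. It assumes the label encoding +1 for lane-change and -1 for lane-keep; the example labels are made up and do not reproduce any of the cited results.

```python
def true_and_false_positive_rate(true_labels, predicted_labels):
    """Return (TPR, FPR) for labels encoded as +1 (positive) and -1 (negative)."""
    tp = fp = tn = fn = 0
    for t, p in zip(true_labels, predicted_labels):
        if t == 1 and p == 1:
            tp += 1      # lane-change correctly detected
        elif t == 1 and p == -1:
            fn += 1      # lane-change missed
        elif t == -1 and p == 1:
            fp += 1      # lane-keep wrongly flagged as lane-change
        elif t == -1 and p == -1:
            tn += 1      # lane-keep correctly recognized
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

# Hypothetical labels for eight samples (made-up values).
true_labels = [1, 1, 1, 1, -1, -1, -1, -1]
predicted_labels = [1, 1, 1, -1, 1, -1, -1, -1]

tpr, fpr = true_and_false_positive_rate(true_labels, predicted_labels)
```

Reporting the true positive rate at a fixed false positive rate, as in the cited work, makes results from different classifiers comparable.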

The advantages of SVM are many. The method works well with small sets of training data and a large number of inputs, which is beneficial for this thesis [24]. SVM is also robust to the overfitting problem by using a cost function for finding a large classifier margin. The approach also allows other mathematical approaches to be incorporated into the method. For example, it can be extended to include fuzzy sets, thereby capturing real-world uncertainty with each point belonging to different classes to some degree [37].

There are some drawbacks of using SVM. One issue is that the method does not incorporate temporal variations as HMM does. For example, a lane-change is not of fixed duration, and longer lane-changes have smoother transitions in the sensor values. However, this problem can be solved by looking at windows of sample data. The continuous data stream is broken down into windows of fixed sample size. The window is shifted for each new data sample, yielding a sample-by-sample (continuous) prediction. The best indication is then given by the label of the last window [16]. Another drawback is the difficult choice of kernel, kernel parameters and regularization parameters. Using the results by Mandalia et al. this problem can be simplified, because simple kernels proved to be sufficient and appropriate parameters can be found using grid or pattern search [16]. Compared to RVM there is also no probabilistic distribution for the output, only two discrete values, one for each class. SVM is also not optimal for large numbers of feature vectors due to the transformation process to obtain the dual form, as described in Appendix A [38]. When using a fixed window of data this problem can be reduced. SVM also requires more computational power than RVM, but since the most time-consuming operations are made off-line this should not affect the performance when implemented in a vehicle.
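The windowing idea described above can be sketched as follows. The window size, the signal values and the `toy_classifier` function are hypothetical; the classifier is only a stand-in for a trained SVM and simply thresholds the change across the window.

```python
def sliding_windows(samples, window_size):
    """Yield overlapping windows of fixed size, shifted by one sample."""
    for end in range(window_size, len(samples) + 1):
        yield samples[end - window_size:end]

def predict_stream(samples, window_size, classify_window):
    """Classify every window, giving one label per new sample."""
    return [classify_window(w) for w in sliding_windows(samples, window_size)]

def toy_classifier(window):
    """Stand-in for a trained SVM: +1 (lane-change) if the signal
    changes by more than 0.5 across the window, otherwise -1 (lane-keep)."""
    return 1 if abs(window[-1] - window[0]) > 0.5 else -1

# Hypothetical lateral position signal in meters (made-up values).
lateral_position = [0.0, 0.1, 0.0, 0.2, 0.6, 1.0, 1.5]

labels = predict_stream(lateral_position, 3, toy_classifier)
current_label = labels[-1]  # label of the most recent window
```

Because every window has the same length, each one maps to a feature vector of fixed size regardless of how long the lane-change takes.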

The advantages of RVM compared to SVM are that RVM has a reduced sensitivity to parameter settings, yields a probabilistic output and does not need a cost function as SVM does. The method can also sift through large inputs of data and still obtain a sparse representation. This results in a less computationally intensive method that requires fewer relevance vectors compared to the number of support vectors for SVM. However, when it comes to results, SVM has actually proved to be slightly superior to RVM [23].

Combining available sensors, tasks to solve and results from previous research, it was decided to choose SVM for the implementation. Even though both HMM and RVM look promising, SVM has yielded higher accuracy in comparative work, which is important to reduce the number of false positives. There is also more information and more toolboxes available for SVM, simplifying the testing and construction of the desired system. In summary, the reasons for selecting and continuing with SVM are:

1. Method is suitable for binary classification problems.

2. Method works well with small sets of training data.

3. Method has yielded highest accuracy in comparative work with other methods.

4. Suitable for large sets of inputs.

5. Method has proven to quickly classify data.

6. There exist toolboxes for Matlab for fast prototyping.

Following this selection, a more extensive mathematical description of SVM is given in Appendix A.


3 Implementation

This chapter describes the implementation of the SVM algorithm for the off-line testing using recorded lane-keep and lane-change data. The first sections describe how sensor data was acquired, followed by a feature analysis. This chapter also describes the process of tuning the SVM algorithm to obtain the highest possible accuracy.

3.1 Acquiring sensor data

All sensors in a Scania truck are connected to the built-in controller area network (CAN). The systems are distributed on three buses (red, yellow and green) with the main control unit COO as connection node, see Figure 15.

Figure 15: CAN-network and main control unit COO. (Diagram labels: Yellow, Green, Red, External, COO, System.)

To acquire the sensor data needed for training the algorithm, field trips were arranged using a Scania R440 truck and five different drivers, see Appendix B for vehicle specifications. A laptop with the software CANalyzer was connected to the external port to listen to and log all CAN communication. Together with the CAN data, videos of the scenario in front of the vehicle were recorded for all trips. The purpose of these videos was to make it easy to find the position of a specific maneuver in the CAN data.

When driving a semi-trailer truck on motorways, events of interest (lane-changes) rarely occur. This is because of the relatively low speed of the truck compared to other vehicles. To increase the probability of natural lane-changes, it was decided to collect data using only the truck without a trailer. This change allowed the drivers to drive at 90 km/h compared to 80 km/h when driving with a trailer. Descriptions of the five trips can be viewed in Table 4.

Table 4: Summary of field trips for recording sensor data.

Trip   Driver, Age   Distance [km]   Duration [min:s]   Road, Route
1      Male, 29      26.4            25:53              E4, Sodertalje to Telefonplan
2      Male, 29      11.8            08:08              E4, Telefonplan to Botkyrka
3      Male, 32      53.0            34:35              E4, Sodertalje to Vagnharad (with return)
4      Male, 50      26.2            18:11              E4, Sodertalje to Telefonplan
5      Male, 42      26.5            23:34              E4, Sodertalje to Vagnharad

This resulted in 143.9 km and 110 minutes of effective motorway driving. It yielded 7 lane-changes due to entrance ramps, 7 lane-changes due to exit ramps and 29 lane-changes during normal motorway driving, 43 in total.


3.1.1 Defining test and training data

To train and test the SVM algorithm, dedicated data sets are required. Smaller subsets of data were extracted from the large CAN logs, defining lane-change and lane-keep samples for lateral actions; preparatory lane-change and lane-keep samples for lateral intent; and subsets defining scenarios prior to lane-change and lane-keep actions. These subsets are defined by using timestamps in the CAN logs. Each subset was also given a label for the classification process, -1 for lane-keep and +1 for lane-change.

For lateral actions, the lane-keep and lane-change maneuvers must be defined. A lane-change action is defined as the movement within the current lane from when the ego-vehicle starts to steer towards the destination lane until the first wheel crosses the lane marking. This is independent of the initial position in the lane before the maneuver. Given this definition, the 43 lane-changes had an average duration of 2.3 seconds. The longest was 6.0 seconds and the shortest 0.6 seconds. Lane-keep data is defined as the data not resulting in a lane-change. This is the majority of all collected data and resulted in approximately 108 minutes of lane-keep driving.

For lane-changes, the preparatory actions end when the driver commences the actual lane-change. Determining the start of the preparatory actions without access to eye- or head-trackers is not as trivial. Based on results from research using these trackers, it was decided to use a fixed start-time, 5.0 seconds prior to a lane-change [19]. As for lateral actions, "preparatory" actions not resulting in lane-changes are selected for the lane-keep samples.

Defining data for inferred environmental intent is similar to how data was selected for lateral intent. For this task it is of interest to determine the scenario prior to a lane-change or lane-keep action. Scenarios for lane-changes are therefore defined as the time between the start of a lane-change and 5 seconds prior to it. All other data is selected as scenarios for lane-keep.
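The labelling rule above can be illustrated with a small sketch. It is not the thesis' extraction script: the function name `label_intent` and the per-timestamp labelling are assumptions made for illustration (the thesis extracts whole subsets by timestamp), while the 5-second horizon and the -1/+1 labels are taken from the text.

```python
LANE_KEEP = -1      # label used in the text for lane-keep
LANE_CHANGE = 1     # label used in the text for lane-change (+1)
PREPARATION_TIME_S = 5.0  # fixed start-time before a lane-change, from the text


def label_intent(timestamps, lane_change_starts, horizon_s=PREPARATION_TIME_S):
    """Label each timestamp +1 if it lies within `horizon_s` seconds
    before the start of a lane-change, otherwise -1 (hypothetical helper)."""
    labels = []
    for t in timestamps:
        in_preparation = any(
            start - horizon_s <= t < start for start in lane_change_starts
        )
        labels.append(LANE_CHANGE if in_preparation else LANE_KEEP)
    return labels
```

With a lane-change starting at 10.0 s, the timestamps from 5.0 s up to (but not including) 10.0 s are labelled +1 and everything else -1.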

3.2 Analysis of sensor data

The sensor signals of interest, from Section 2.5.2, were exported from CANalyzer to Matlab for each training and test sample, for analysis and processing before the classification. Depending on the origin of a sensor signal, there are some important properties to consider. Sensor data from the ACC radar and the LDW camera are sometimes discontinuous, see the example for the LDW camera in Figure 16. This discontinuity occurs when these sensors for some reason lose tracking of their targets, e.g. the ACC radar cannot see any target in front of the ego-vehicle or the LDW camera cannot distinguish any lane markings. When this happens, these sensors send out CAN messages as usual but with values defined by an upper boundary. These values are therefore much larger than normal values, making the signal resemble a discontinuity. Except for signals from these systems, all other sensor data can be considered continuous.


(a) Distance to left lane marking [cm] as a function of time [s], with discontinuities.

(b) Steering wheel angle [rad] as a function of time [s], without discontinuities.

Figure 16: Two signals showing the necessity to handle discontinuities.

When defining lane-keep and lane-change data for the training samples, it is important to avoid these discontinuities. This guarantees that unusable data will not affect the performance of the classifier. For a real implementation it will be necessary to handle these discontinuities on the fly, before data is sent to be classified. One approach is to introduce status flags for each sensor, so that at each new classification it can be checked which sensor data is available. For a continuous classification process this implies that different decision surfaces will be required depending on the status flags. Thus, different decision surfaces will yield different accuracies. It is also possible to transform these status flags and think of them as inputs to the system, e.g. a discontinuity for the ACC system denotes that there is no vehicle in front of the ego-vehicle.
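A minimal sketch of such a status check is given below. The threshold `UPPER_BOUND_CM` and the function names are hypothetical; the actual upper-boundary values sent by the ACC radar and LDW camera are not stated here, so the number is only a placeholder.

```python
# Hypothetical threshold: any value at or above it is treated as the
# sensor's "no target" placeholder rather than a real measurement.
UPPER_BOUND_CM = 500.0


def valid_mask(samples, upper_bound=UPPER_BOUND_CM):
    """Return one status flag per sample: True for a usable measurement,
    False where the sensor reported its upper-boundary value."""
    return [value < upper_bound for value in samples]


def window_is_usable(samples, upper_bound=UPPER_BOUND_CM):
    """A window is passed on to the classifier only if all samples are valid."""
    return all(valid_mask(samples, upper_bound))
```

The per-sample flags could either block a window from classification or be fed to the classifier as an extra input, as described above.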

Preceding the classification process, all inputs are also scaled to the range -1 to 1 to simplify the calculations. This operation is necessary during both training and testing, since the inputs are required to have the same properties.
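One common way to achieve this is linear min-max scaling, sketched below. That the scaling limits are taken from the training data and reused unchanged for the test data is an assumption made for this sketch; the function names are hypothetical.

```python
def fit_scaler(training_values):
    """Return the (minimum, maximum) of the training data for one signal."""
    return min(training_values), max(training_values)


def scale(values, lower, upper):
    """Map values linearly so that `lower` becomes -1 and `upper` becomes +1."""
    if upper == lower:
        # Constant signal: no spread to scale, map everything to 0.
        return [0.0 for _ in values]
    return [2.0 * (v - lower) / (upper - lower) - 1.0 for v in values]
```

Note that test values outside the training range end up slightly outside the interval -1 to 1 with this choice.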

The signal for longitudinal acceleration, sampled at 50 Hz, contained many high-frequency components (noise), which were filtered out to highlight the data of interest. The noise comes from vehicle motions, vibrations and the sensor itself. A Butterworth low-pass filter was added to dampen frequencies above the chosen cutoff frequency of 5 Hz. This introduced a delay for the filtered signal (of approximately 0.1 seconds), but the gain of having a smooth signal was considered more important in order to capture significant changes rather than small instantaneous ones.
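The filtering step can be sketched as follows. The filter order is not stated in the text, so a second-order Butterworth filter (designed with the bilinear transform) is assumed here purely for illustration; the 5 Hz cutoff and 50 Hz sampling rate are taken from the text, and the function names are hypothetical.

```python
import math


def butter2_lowpass_coefficients(cutoff_hz, sample_rate_hz):
    """Coefficients of a 2nd-order Butterworth low-pass (bilinear transform)."""
    k = math.tan(math.pi * cutoff_hz / sample_rate_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b = (b0, 2.0 * b0, b0)
    a = (2.0 * (k * k - 1.0) * norm, (1.0 - math.sqrt(2.0) * k + k * k) * norm)
    return b, a


def lowpass(samples, cutoff_hz=5.0, sample_rate_hz=50.0):
    """Causal filtering, sample by sample, starting from a zero state."""
    (b0, b1, b2), (a1, a2) = butter2_lowpass_coefficients(cutoff_hz, sample_rate_hz)
    x1 = x2 = y1 = y2 = 0.0
    filtered = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        filtered.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return filtered
```

Because the filter is causal, the output lags the input, which corresponds to the delay mentioned above; the size of the delay depends on the filter order actually used.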

3.3 Enhancement of sensor data

The original signals presented in Section 2.5.2 can be combined and extended using expert knowledge to present the sensor data better and more compactly. Three new signals are the result of this transformation: current lane position, future lane position and difference from reference speed. Each of these is described in the following subsections.

3.3.1 Current lane position

To calculate the current lane position, three signals from the LDW camera are used: distance from left lane marking, distance from right lane marking and lane width. The distances to the lane markings are calculated from the center point of the truck. A schematic view of these signals is seen in Figure 17. This figure also shows the angle towards the left lane marking, a signal required for calculating the future lane position described in Section 3.3.2.

Figure 17: Schematic view of the lane width (w), distance to right (dRIGHT) and left (dLEFT) lane marking and angle to left lane marking (αLANE).

If only one lane marking is visible, it is not possible for the LDW camera to determine the lane width. For these situations a representative value is required. From the collected field trip data, this representative value could be calculated to 363 centimeters for a normal motorway lane in Sweden. Using this information, the current lane position can be calculated as the quotient of the distance from the right lane marking and the lane width,

$$\mathrm{pos}_c = \frac{d_{\mathrm{right}}}{w} \tag{15}$$

with $w$ being the measured or mean lane width and $d_{\mathrm{right}}$ being the distance from the right lane marking. Depending on whether only one or both lane markings are visible, there are three cases for calculating the distance from the right lane marking,

$$d_{\mathrm{right}} = \begin{cases} d_{\mathrm{right}}, & \text{if only right lane marking is visible} \\ w_{\mathrm{mean}} - d_{\mathrm{left}}, & \text{if only left lane marking is visible} \\ d_{\mathrm{right}}, & \text{if both lane markings are visible} \end{cases} \tag{16}$$

with $d_{\mathrm{left}}$ and $d_{\mathrm{right}}$ denoting the distances from the lane markings and $w_{\mathrm{mean}}$ being the mean lane width.
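Equations 15 and 16 can be sketched in code as below. Representing a non-visible lane marking by `None` and the function name `current_lane_position` are assumptions of this sketch; the mean lane width of 363 cm is taken from the text.

```python
MEAN_LANE_WIDTH_CM = 363.0  # representative motorway lane width from the text


def current_lane_position(d_left=None, d_right=None, lane_width=None):
    """Return pos_c = d_right / w (Equation 15), distances in centimeters.

    A value of None means the marking (or the lane width) is not available.
    """
    if d_left is None and d_right is None:
        raise ValueError("no lane marking visible")
    if d_right is None:
        # Only the left marking is visible: middle case of Equation 16.
        d_right = MEAN_LANE_WIDTH_CM - d_left
    # Use the measured width when available, otherwise the mean width.
    w = lane_width if lane_width is not None else MEAN_LANE_WIDTH_CM
    return d_right / w
```

For a truck centered in a lane of mean width, the function returns 0.5.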


3.3.2 Future lane position

It is desired to approximate the lane position of the vehicle t seconds into the future. This signal would be a good indicator of whether the current vehicle motions could result in a lane-change. It can be viewed as a simple version of a time to line crossing (TLC) signal. To approximate this future lane position, the following signals are used: current vehicle speed, yaw rate, current angle towards the lane markings, road curvature and time t. For the implementation a time of t = 2.6 seconds was used, a value chosen because it represents the average time for a lane-change.

The problem then becomes a trigonometric one. First, the future vehicle position with reference to the current lane markings is determined,

$$\mathrm{pos}_v = \mathrm{pos}_c + \frac{v \cdot t \cdot \sin(\alpha + \omega \cdot t)}{w} \tag{17}$$

with $\mathrm{pos}_c$ from Equation 15, $t$ the selected future time, $v$ the vehicle speed, $\alpha$ the angle towards the current lane markings, $\omega$ the yaw rate and $w$ the lane width. With the future vehicle position known, it is also required to determine where the lane markings will be given the current curvature. The curve radius is calculated with,

$$r_{\mathrm{curve}} = \frac{1}{\kappa} \tag{18}$$

with $\kappa$ being the curvature. The lateral movement of the curve can then be calculated with

$$\delta_{\mathrm{curve}} = \frac{r_{\mathrm{curve}} - \sqrt{r_{\mathrm{curve}}^2 - l^2}}{w} \tag{19}$$

with $l$ being,

$$l = v \cdot t \cdot \cos(\alpha + \omega \cdot t) \tag{20}$$

The future lane position, given the updated position of the lane markings, can now be calculated according to,

$$\mathrm{pos}_f = \mathrm{pos}_v - \delta_{\mathrm{curve}} \tag{21}$$
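As a rough illustration, Equations 17 to 21 can be chained as in the sketch below. Several details are assumptions not stated in the text: all lengths use the same unit (here meters, with speed in m/s), a curvature of zero is treated as a straight road with no lateral curve movement, and the sign of the curvature is carried over to the lateral curve movement. The function name is hypothetical.

```python
import math


def future_lane_position(pos_c, v, alpha, omega, kappa, w, t=2.6):
    """Approximate the lane position t seconds ahead (Equations 17 to 21).

    pos_c: current lane position, v: speed, alpha: angle towards the lane
    markings [rad], omega: yaw rate [rad/s], kappa: curvature, w: lane width.
    """
    heading = alpha + omega * t
    pos_v = pos_c + v * t * math.sin(heading) / w          # Equation 17
    l = v * t * math.cos(heading)                          # Equation 20
    if kappa == 0.0:
        delta_curve = 0.0                                  # assumed: straight road
    else:
        r_curve = 1.0 / kappa                              # Equation 18
        if abs(l) > abs(r_curve):
            raise ValueError("travelled distance exceeds curve radius")
        offset = abs(r_curve) - math.sqrt(r_curve ** 2 - l ** 2)
        # Equation 19, with the curvature sign restored (assumption).
        delta_curve = math.copysign(offset, kappa) / w
    return pos_v - delta_curve                             # Equation 21
```

On a straight road with zero angle and zero yaw rate, the future lane position equals the current one, which is a quick sanity check of the chain.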

3.3.3 Difference from desired speed

This new signal is the difference between the desired speed and the current speed. It would serve as a good indicator when determining the inferred environmental intent. As explained earlier, the driver is assumed to always try to reach the maximum allowed speed. When gathering data for this implementation the desired speed was 89 km/h, since the test vehicle was electronically limited at this level. The signal is calculated as,

$$v_{\mathrm{diff}} = v_a - v_c \tag{22}$$

with $v_a$ being the desired velocity and $v_c$ the current velocity. For a real implementation it will also be necessary to adjust the desired speed with regard to the inclination of the road and the weight of the goods. If the ACC is activated, this system will determine the desired speed.


3.4 System design

The goal of this off-line classification system is to resemble the continuous point-to-point system that will be necessary in real vehicles. It is therefore not sufficient to recognize a maneuver only when it is completed or when large blocks of it are completed. This can be compared to speech recognition, where we want to detect and classify each syllable rather than looking at whole words or sentences. To obtain this point-to-point accuracy, a moving window is used that shifts through the training and test data, see Figure 18. By using a moving window of fixed size, a dependency between sensor values can be calculated, a dependency that shifts as the window shifts. If only the latest value is required, it can be extracted from the window without calculating the dependency.

Figure 18: To achieve point-to-point recognition a window of limited size shifts through the stream of data. (Diagram labels: stream of sensor CAN messages, window of data, times t0, t1 and t2.)

When classifying lateral actions with frequently changing data, it is desired to use this dependency between sensor values. This is because it is more interesting to see how these values change than to look at the instantaneous values [16]. This approach should increase the likelihood of capturing lane-changes when driving through curves. To obtain this dependency, the variance is calculated for each window. The variance describes how far sensor measurements are spread out from a mean value, thus effectively capturing changes. To capture multiple changes within a window, it is possible to divide it into multiple blocks, which can be overlapping or sequential. For the testing, three different window sizes will initially be used, 1.0, 1.5 and 2.0 seconds, with two sequential blocks per window. The variance is calculated with

$$\mathrm{variance}_x = \frac{1}{N} \sum_{i=1}^{N} (\mu_x - x_i)^2 \tag{23}$$

with $N$ being the number of elements (sensor values) in the block, $\mu_x$ the mean of the window and $x_i$ the sensor value at sample point $i$. Calculating the variance for a window further reduces the amount of data to be classified. Another benefit of using the variance is that problems of misinterpreting driving through curves should be reduced, since subtracting the mean of the window helps to eliminate these errors.
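The windowing and Equation 23 can be sketched as follows. The sketch takes the text literally in using the mean of the whole window for every block; the function names, the step size and the requirement that the window divides evenly into blocks are assumptions made for illustration.

```python
def block_variances(window, blocks=2):
    """Equation 23 per sequential block, using the mean of the whole window."""
    n = len(window)
    if n == 0 or n % blocks != 0:
        raise ValueError("window must split evenly into the given blocks")
    mean = sum(window) / n
    size = n // blocks
    features = []
    for b in range(blocks):
        block = window[b * size:(b + 1) * size]
        features.append(sum((mean - x) ** 2 for x in block) / size)
    return features


def sliding_features(signal, window_len, step=1, blocks=2):
    """Shift a fixed-size window through the signal, one feature list per position."""
    return [
        block_variances(signal[start:start + window_len], blocks)
        for start in range(0, len(signal) - window_len + 1, step)
    ]
```

A constant signal gives zero variance in every block, so a steady offset such as a constant steering angle in a curve does not by itself produce a large feature value.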

3.4.1 Elements of the system

The test system is planned to work according to the flowchart in Figure 19. Sensor data is received from CAN messages and the window of data is shifted one step. For the off-line testing this is equal to feeding the system with logs (or samples) of CAN data. This is followed by an analysis and transformation block that scans the window to see if there are any discontinuities in the data. If required, discontinuous samples are blocked for the rest of the process. At this stage the sensor data is also enhanced and scaled. With the available data known, optimal sensor sets can be selected for each task. These sensor sets are fed to the SVM classifiers to detect and predict lane-changes.

Figure 19: Flowchart of the system design. (Flowchart steps: receive CAN messages or CAN logs for off-line testing; feature analysis to detect discontinuities and process to enhance sensor data; select optimal sensor sets for each task; classification in three branches, Lateral Action, Lateral Intent and Inferred Environmental Intent; compare current classification results with previous ones; transmit detections and predictions.)

As seen in Figure 19, determining the inferred environmental intent is a multi-class problem. This requires solving two binary problems (one versus one) and combining the results. This is not needed for lateral actions because the variance is calculated for the data in those sets. The variance squares the values, thus removing the signs. Therefore, left and right lane-changes can be treated as equal in the classification process. If the algorithm outputs a lane-change, the direction can be determined by looking at the two latest values, according to how a lane-change is defined in this thesis. A similar approach is used for lateral intent, but here the lane position (or turn signals, if used) is used to determine the direction.

Instead of communicating the results from each classification directly, it is also desired to compare the classifications with previous results. This is because the likelihood of correct predictions increases as the maneuver progresses. The last step is to communicate the results by sending CAN messages before the process repeats.

3.5 Sensor sets

Rather than using all available sensor data for the classification process, it is better to find an optimal subset of sensor data that yields the highest accuracy. This reduces the computational load and also removes any sensor that may not be beneficial for the specific task.

Depending on whether there are discontinuities in the sensor data, it is also required to calculate different decision surfaces to cover these situations. The accuracies for all these different sets will be presented in Chapter 4.

3.5.1 Sensor sets for detecting lateral actions

When classifying lane-change and lane-keep actions, it was decided to calculate the accuracy for the sensor sets presented in Table 5. These seven sensor sets were selected to cover scenarios both with and without the LDW camera sensor. The selection was also based on sensors that should yield noticeable changes in their variance during lane-keep and lane-change actions.

Longitudinal acceleration was used instead of the accelerator and brake pedal positions, partly because it indirectly measures these two and partly because it covers more scenarios, for example when the ego-vehicle accelerates on a downhill slope without the driver changing the accelerator pedal position. Another reason is that the ACC system was used during a majority of the recorded trips.

If a sensor signal in Table 5 is followed by a (V), the variance for the blocks in the window is calculated; if it is followed by an (X), the last value of the window is used.

Table 5: Sensor sets for detecting lateral actions.

Set   Sensor signals
A.1   Steering Wheel Angle (V), Yaw Rate (V), Lateral Acceleration (V), Longitudinal Acceleration (V)
A.2   Steering Wheel Angle (V), Yaw Rate (V), Lateral Acceleration (V)
A.3   Steering Wheel Angle (V), Yaw Rate (V), Lateral Acceleration (V), Longitudinal Acceleration (V), Current Lane Position (V), Current Lane Position (X), Future Lane Position (V), Future Lane Position (X)
A.4   Steering Wheel Angle (V), Yaw Rate (V), Lateral Acceleration (V), Current Lane Position (V), Future Lane Position (V)
A.5   Steering Wheel Angle (V), Lateral Acceleration (V), Current Lane Position (V), Future Lane Position (V)
A.6   Steering Wheel Angle (V), Current Lane Position (V), Future Lane Position (V)
A.7   Steering Wheel Angle (V), Current Lane Position (V), Current Lane Position (X), Future Lane Position (V), Future Lane Position (X)

3.5.2 Sensor sets for predicting lateral intent

Classification of lateral intent has proved to be a difficult task without eye- or head-trackers. From the selection of sensor sets for lateral action, two similar sets were chosen for lateral intent, see Table 6. For these sets the focus was on possible patterns in lane position, steering wheel angle, longitudinal acceleration and turn signals.


Table 6: Sensor sets for predicting lateral intent.

Set   Sensor signals
B.1   Steering Wheel Angle (V), Longitudinal Acceleration (V), Current Lane Position (V), Current Lane Position (X)
B.2   Steering Wheel Angle (V), Longitudinal Acceleration (V), Current Lane Position (V), Current Lane Position (X), Turn Signals (X)

3.5.3 Sensor sets for inferred environmental intent

The approach for classifying inferred environmental intent is a modified and simpler version of the flowchart in Appendix C. This is because the sensor data needed for a fully functional flowchart was not available in the test vehicle.

Two different sets were selected for this task, see Table 7. The first set was selected to classify between left lane-change and lane-keep. This set simply shows whether the driver should change lane to the left due to a slow leading vehicle and the relative velocity in the left adjacent lane. For this approach to be reasonable, it required the addition of a new signal, lane-change allowed, based on the signals current lane and number of lanes for the road segment. This eliminates situations where the classifier would suggest a lane-change that is not possible because the vehicle is already in the outer lane. Current lane and number of lanes were signals not available in the test vehicle, but since they can easily be obtained using GPS systems and some additional logic, they were manually added to each training and test sample.

The second set is very similar to the first one, with the difference that the goal is now to determine whether the driver wants to change lane to the right, for example after overtaking, as seen in Figure 12k in Section 2.4.

Table 7: Sensor sets for inferring environmental intent.

Set   Sensor signals
C.1   Distance to Leading Vehicle (X), Relative Velocity of Leading Vehicle (X), Difference from Desired Speed (X), Relative Velocity in Left Adjacent Lane (X), Left Lane-change Allowed (X)
C.2   Distance to Leading Vehicle (X), Relative Velocity of Leading Vehicle (X), Difference from Desired Speed (X), Relative Velocity of Vehicles in Right Adjacent Lane (X), Right Lane-change Allowed (X)

It should be noted that the relative velocity is based solely on leading vehicles in the current and adjacent lanes, and that this approach does not account for vehicles approaching from behind.

3.6 Classifier design

The SVM algorithm is built using a library of functions created by Steve Gunn [39]. To implement the algorithm, the classifier design suggested by Chih-Wei Hsu et al. is followed [40]:

1. Transform data to the format specified by the SVM algorithm

2. Conduct scaling on input data

3. Choose SVM kernel function

4. Use cross-validation to find optimal parameters


5. Use optimal parameters to train entire set

6. Test

The tuning of the SVM algorithm is important for the overall performance. The goal is to obtain high accuracy when classifying new data, not necessarily a solution that is perfect at classifying the training data. Tuning the algorithm involves selecting an appropriate kernel as described in Appendix A. This is a difficult task and a perfect kernel may never be found. The RBF (radial basis function) kernel is therefore a reasonable first choice. It maps inputs nonlinearly to higher dimensions, which makes it suitable for nonlinear tasks. It also has only two adjustable parameters, the cost parameter $\lambda$ and the model parameter $\sigma$.
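For orientation, a Gaussian RBF kernel is often written as in the sketch below. The exact parameterization used in this thesis is the one defined in Appendix A; the form with $2\sigma^2$ in the denominator is a common convention and is only assumed here, as is the function name.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel, assuming the common form exp(-|x - y|^2 / (2 sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))
```

Identical inputs give a kernel value of 1, and the value decays towards 0 as the inputs move apart, at a rate controlled by the model parameter.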

There are two approaches for finding optimal parameters, pattern search and grid search. A grid search looks at a range of values using a fixed step. First a coarse search is made to find the optimal region; if this region exists, a finer search using smaller steps is performed. A pattern search starts at the middle of the search range for each parameter and tests steps in both directions. If the model improves, the search center is moved to this new point; otherwise the step size is reduced and a new search is commenced. The search continues until the step size reaches a given tolerance [41].

The tuning process may require substantial computational power. For example, a grid search for two parameters with 10 data points each results in a computation of 100 grid points. As the number of parameters and the search interval increase, the computational burden increases drastically. The pattern search method generally requires fewer computations, but it may only find a local extremum rather than a global one. The grid search is therefore selected for its straightforward approach and because it covers the whole search range, so that the best region within that range will be found. The coarse grid search is performed by stepping through the following values for the cost parameter, $\lambda = 10^{-1}, 10^{-2}, \ldots, 10^{17}$, and the RBF parameter, $\sigma = 10^{-3}, 10^{-2}, \ldots, 10^{3}$.

In combination with the grid search, cross-validation is used to further optimize the training of the algorithm. With cross-validation the data is partitioned into k equally sized segments. Subsequently, k iterations are performed, and for each iteration one segment is used for testing and the other k-1 are used for training. Using cross-validation it is possible to better estimate the performance of the algorithm and to find better parameters for the model. Cross-validation increases the computational cost of the grid search by a factor k.
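The combination of grid search and k-fold cross-validation can be sketched as below. This is not the Matlab script from Appendix D: the function names are hypothetical, the folds are taken as contiguous segments, and the SVM itself is hidden behind a caller-supplied `train_and_score` function that is assumed to train on one part of the data and return the accuracy on the other.

```python
import itertools


def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def grid_search(samples, labels, cost_values, sigma_values, k, train_and_score):
    """Return ((cost, sigma), mean accuracy) of the best grid point."""
    best_params, best_score = None, -1.0
    for cost, sigma in itertools.product(cost_values, sigma_values):
        scores = []
        for train, test in k_fold_splits(len(samples), k):
            scores.append(train_and_score(
                cost, sigma,
                [samples[i] for i in train], [labels[i] for i in train],
                [samples[i] for i in test], [labels[i] for i in test],
            ))
        mean_score = sum(scores) / k
        if mean_score > best_score:
            best_params, best_score = (cost, sigma), mean_score
    return best_params, best_score
```

The scoring function is called once per grid point and fold, which makes the factor k in the computational cost explicit.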

Appendix D contains semi pseudo-code for the Matlab script used for the grid search. This appendix also contains the script used to create ROC graphs, as explained in the next subsection.

3.6.1 Introduction to ROC graphs

Receiver operating characteristics (ROC) graphs are a method for visualizing the performance of classifiers. The method was originally used in medical decision making but has been adopted in the fields of machine learning and pattern recognition [42].

A binary classifier, with inputs being mapped to one of two discrete classes, will result in four outcomes depending on the predicted class and the actual class. The predicted class is here denoted {p, n}, while {P, N} denotes the actual class. If an input is positive (P) and classified as positive (p), it is counted as a true positive, but if it is classified as negative (n), it is counted as a false negative. If the input is negative (N) and classified as positive (p), it is counted as a false positive, but if it is classified as negative (n), it is counted as a true negative. This can be represented with a contingency table, see Table 8.


Table 8: Contingency table for ROC curves.

               p (predicted)     n (predicted)
P (actual)     True Positive     False Negative
N (actual)     False Positive    True Negative

From the contingency table, four important metrics can be calculated,

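$$\text{false positive rate (FPR)} = \frac{\text{False positives}}{\text{Total negatives}} = \frac{\text{False positives}}{\text{False positives} + \text{True negatives}} \tag{24a}$$

$$\text{true positive rate (TPR)} = \frac{\text{True positives}}{\text{Total positives}} = \frac{\text{True positives}}{\text{True positives} + \text{False negatives}} \tag{24b}$$

$$\text{precision} = \frac{\text{True positives}}{\text{True positives} + \text{False positives}} \tag{24c}$$

$$\text{accuracy} = \frac{\text{True positives} + \text{True negatives}}{\text{Total positives} + \text{Total negatives}} \tag{24d}$$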

In the literature, the true positive rate is also known as recall.

A ROC graph is a two-dimensional graph showing the relation between TPR and FPR, see Figure 20. The point (0,1) represents perfect classification, with 100% true positive rate and 0% false positive rate. Generally, classifiers in the upper left corner of the ROC graph make positive classifications only with strong evidence, which reduces the number of false positives. In comparison, points (classifiers) in the upper right corner make positive classifications with less evidence, resulting in higher rates of false positives. A classifier that ends up in the lower right corner performs worse than guessing the outcome [42].
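The four metrics in Equations 24a to 24d follow directly from the counts in the contingency table. A minimal sketch, with a hypothetical function name and made-up counts in the usage note:

```python
def roc_metrics(true_pos, false_neg, false_pos, true_neg):
    """Compute FPR, TPR, precision and accuracy (Equations 24a to 24d)."""
    return {
        "fpr": false_pos / (false_pos + true_neg),
        "tpr": true_pos / (true_pos + false_neg),
        "precision": true_pos / (true_pos + false_pos),
        "accuracy": (true_pos + true_neg)
        / (true_pos + false_neg + false_pos + true_neg),
    }
```

For example, 8 true positives, 2 false negatives, 1 false positive and 9 true negatives give a TPR of 0.8 and an FPR of 0.1, which would place the classifier in the upper left region of the ROC graph.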

Figure 20: ROC graph (true positive rate against false positive rate) with the performance of some example classifiers. Notable is the point (0,1) equivalent to perfect classification.


4 Results

This chapter presents the classification results for the three tasks, lateral actions, lateral intent and inferred environmental intent, using the implemented SVM algorithm and the recorded sensor data. It shows how different parameters and sensor sets affect the performance and the computational requirements.

4.1 Classification of lateral actions

The grid search using four-fold cross-validation to determine the optimal kernel parameter $\sigma$, cost parameter $\lambda$ and windowsize $\gamma$ for lateral actions required a total of 9604 combinations to be calculated. The training set consisted of 21 lane-changes of varying size and a larger amount of lane-keep samples.

Initially three windowsizes, 1.0, 1.5 and 2.0 seconds, were selected for the coarse grid search. Results from using these windowsizes, presented in Table 9, showed that the highest accuracy was achieved for set A.5 with $\gamma = 1.0$ seconds. In an attempt to improve these results using a finer grid search, three more windowsizes were tested, 0.5, 0.75 and 1.25 seconds. With $\gamma = 0.75$ a new highest accuracy of 75.8% was achieved for sets A.4 and A.5, see Table 9. Both these accuracies were obtained with parameters $\sigma = 0.001$ and $\lambda = 10^5$.

Sets A.1 and A.2 are the only sets without signals from the LDW camera. In other words, if the LDW camera malfunctions, these are the two sets that have to be used for detecting lateral actions. For the initial selection of windowsizes, the accuracy for these sets seemed to improve with increasing windowsize. Additional windowsizes of 2.5, 3.0 and 3.5 seconds were therefore added to test this pattern. This resulted in improved accuracies, with the highest being 61.7% for set A.1.

Table 9: Training accuracy after gridsearch for sensor sets A.1 to A.7 for lateral actions.

                 Accuracy [%]
Windowsize [s]   A.1    A.2    A.3    A.4    A.5    A.6    A.7
0.50             54.7   52.6   67.8   71.6   72.2   71.6   67.2
0.75             51.9   54.0   68.8   75.8   75.8   74.8   68.2
1.0              55.3   55.9   69.0   74.7   75.1   73.6   70.0
1.25             54.7   55.5   67.0   72.2   72.0   72.4   69.0
1.5              57.4   57.0   67.4   71.8   70.4   70.0   69.3
2.0              59.6   59.1   69.7   73.0   73.2   71.8   70.5
2.5              60.2   59.4   69.4   71.7   70.4   69.6   69.4
3.0              61.0   59.6   70.3   70.6   70.6   70.5   71.0
3.5              61.7   60.4   70.9   69.7   69.7   69.1   68.5

With parameters found for each combination of sensor set and windowsize, the classifiers were trained once again, this time with the whole training set and without cross-validation, according to the classifier design in Section 3.6. This was followed by a test using data with targets unknown to the algorithm, consisting of 22 lane-change and 22 lane-keep samples. The performance of the classifiers in this test is presented in Figure 21 using a ROC graph.


Figure 21: Performance of the classifiers from Table 9 using a test set of data (ROC graph of true positive rate against false positive rate; marked classifiers: Min FPR, Max TPR, Max Accuracy, Max Precision).

In Figure 21 the classifiers with minimum FPR, maximum TPR, maximum precision and maximum accuracy can be seen. Set A.3 ($\gamma = 0.5$) had the lowest FPR with 17.6% at 72.1% TPR. Set A.5 ($\gamma = 0.75$) had the highest TPR with 86.0% at 27.7% FPR. Maximum accuracy was achieved using set A.4 ($\gamma = 0.5$) with 81.6% at 84.3% TPR and 21.2% FPR. Maximum precision was obtained with set A.3 ($\gamma = 0.5$) with 80.4% at 72.1% TPR and 17.6% FPR.

A general conclusion from the ROC graph in Figure 21 is that all classifiers suffer from high false positive rates. The main reasons for this are the definition of a lane-change and the characteristics of some of the sensor data. In Figure 22a the lane position is plotted as a function of time, and the start of the lane-change is marked using a dotted line. In general, and in this example, it can be seen that the early stages of a lane-change do not differ much from lane-keep actions, which makes classification difficult and leads to the high number of false positives. In Figure 22b it can also be seen that the steering wheel angle does not display significant characteristics for the lane-change, which also led to a high number of false positives. A similar lack of significant characteristics proved to exist for lateral acceleration, yaw rate and longitudinal acceleration. The reason for this is the smooth transitions in sensor values during the lane-changes. This is especially true when driving a truck, since a truck cannot "jump" into the other lane as easily as a car, due to its size and limited acceleration.

(a) Lane position [%] as a function of time [s], with the start of the lane-change marked.

(b) Steering wheel angle [rad] as a function of time [s], with the start of the lane-change marked.

Figure 22: Characteristics of lane position and steering wheel angle before and during a lane-change.

4.1.1 Time until correct classification

This type of classification system does not only require high TPR and low FPR; it should also be able to make new classifications quickly. To get a rough estimate of the time needed for each classification, the computations were timed in Matlab. For these measurements, precomputed decision surfaces were used to classify a set of data. Four classifiers were chosen, the ones that yielded minimum FPR, maximum TPR, maximum accuracy and maximum precision in Figure 21. The hardware for the test consisted of an Intel Core2 Duo processor SU7300 (1.3 GHz) with 4 GB of RAM, running Matlab 64-bit version 7.12.0. The required time in milliseconds for each classification is presented in Table 10.

Table 10: Required time to compute a new classification.

Type            Set, Windowsize [s]   Time [ms]
Min FPR         A.3, 0.50             53
Max TPR         A.5, 0.75             54
Max Accuracy    A.4, 0.50             53
Max Precision   A.3, 0.50             53

As expected, the computations for classifiers using the same windowsize required an equal amount of time. The larger windowsize (γ = 0.75) generally required 1 millisecond more for every classification. On this system, this allowed almost 19 classifications per second, a rate that can be improved using better hardware and optimized software.
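The rate quoted above follows directly from the timings in Table 10. As a small check, the Python sketch below converts the time per classification into classifications per second. Python is used here for illustration only (the measurements themselves were made in Matlab), and the function name is hypothetical.

```python
def classifications_per_second(time_ms: float) -> float:
    """Convert the time for one classification [ms] into a rate [1/s]."""
    if time_ms <= 0:
        raise ValueError("time_ms must be positive")
    return 1000.0 / time_ms


# Timings taken from Table 10 (windowsize 0.50 s and 0.75 s).
for window_s, time_ms in [(0.50, 53), (0.75, 54)]:
    rate = classifications_per_second(time_ms)
    print(f"windowsize {window_s:.2f} s: {rate:.1f} classifications per second")
```

With 53 ms per classification this gives roughly 18.9 classifications per second, which matches the "almost 19" stated above.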

Figure 23 visualizes a typical lane-change by plotting the lane position as a function of time. The figure also shows the result of the classifications: × denotes that the classifier predicts lane-keep and • that it predicts a lane-change. The algorithm in Figure 23 makes a classification every 100 milliseconds.

[Plot: lane position [%] against time [s], 2520 to 2525 s; legend: lane position, start of lane-change, front wheel touches lane marking.]

Figure 23: The classification output as the lane position changes. Predicted lane-keeps are marked with × and predicted lane-changes with •.

The time until the algorithm correctly classifies a lane-change can be defined in multiple ways. The definition used here is the time until the classifier predicts only lane-change, without any interruptions. In Figure 41 this would be after 0.55 seconds. It should be noted that this definition is only applicable for off-line analysis. A system implemented in a vehicle would require more evidence because of the high rate of false positives (and because future classifications are unknown). More examples of both good and bad classification outputs can be seen in Appendix E.

Because the duration of lane-changes varies, it is better to determine how far into the maneuver, in percent, the classifier correctly detects it; in other words, the time until detection divided by the total time for the maneuver. A lane-change is here defined as the movement within the starting lane until the first wheel touches the lane marking. In Figure 23 the classifier correctly detects the lane-change after 33% of its completion. A summary of the quickness for the sensor sets and windowsizes in Table 10 is presented in Table 11.

Table 11: Average percent of completed lane-change until correct classification.

Type            Set    Windowsize [s]    Mean [%]
Min FPR         A.3    0.50              31.4
Max TPR         A.5    0.75              17.2
Max Accuracy    A.4    0.50              21.8
Max Precision   A.3    0.50              31.4

The results in Table 11 are highly logical. Set A.3 with a windowsize of 0.5 seconds has the lowest FPR and thus requires more evidence before it classifies data as a lane-change. In comparison, set A.5 with a windowsize of 0.75 seconds has the highest TPR (and FPR) and requires less evidence before classifying data as a lane-change.
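The off-line definition used in this section (the first point after which the classifier only predicts lane-change, expressed as a share of the whole maneuver) can be sketched as follows. This is an illustrative Python sketch, not the Matlab code used in the study; the function name and the example outputs are hypothetical.

```python
from typing import Optional, Sequence


def detection_fraction(times_s: Sequence[float],
                       is_lane_change: Sequence[bool],
                       start_s: float,
                       end_s: float) -> Optional[float]:
    """Share of the maneuver completed when the detection becomes stable.

    Stable detection is the first classification inside [start_s, end_s]
    after which every remaining output up to end_s is a lane-change.
    Returns None if the last output inside the maneuver is a lane-keep.
    """
    if end_s <= start_s:
        raise ValueError("end_s must be later than start_s")
    # Keep only the classifier outputs that fall inside the maneuver.
    inside = [(t, c) for t, c in zip(times_s, is_lane_change)
              if start_s <= t <= end_s]
    detected_at = None
    # Walk backwards to find where the final uninterrupted run begins.
    for t, c in reversed(inside):
        if not c:
            break
        detected_at = t
    if detected_at is None:
        return None
    return (detected_at - start_s) / (end_s - start_s)


# Hypothetical classifier outputs, not measured data.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
outputs = [False, True, False, True, True]
fraction = detection_fraction(times, outputs, start_s=0.0, end_s=2.0)
print(f"stable detection after {fraction:.0%} of the maneuver")  # 75%
```

Note that the early lane-change output at 0.5 s does not count, since it is interrupted by a lane-keep output at 1.0 s. This is also why the definition only works off-line: it needs the outputs that follow.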


4.1.2 Approach to reduce the false positive rate

For this system to be effective, the false positive rate must be lowered significantly. Otherwise the system would frequently communicate to cooperating ADAS that the driver is performing a lane-change when the actions are in fact lane-keep actions, a problem that would cause more annoyance than if the system failed to recognize an actual lane-change early. One attempt to lower the false positive rate is to increase the evidence required before the classifier outputs a lane-change. One way to do this is to move the defined start time of a lane-change in the training set closer to the end of the maneuver (introducing a “delay”). This gives a trade-off between more evidence and delayed opportunities to classify a lane-change.

To test this approach, three sets, A.3, A.4 and A.5, were tested with two windowsizes, 0.5 and 0.75 seconds, for the delayed start times. First a gridsearch was conducted to determine the kernel parameter σ and cost parameter λ when the delay α moves the start-time 20, 40, 60 and 80% closer to the end-time of the maneuver. It should be noted that this delay affects start-times differently because of the varying duration of lane-changes.
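The effect of the delay α on the labeled start-time can be illustrated with a short Python sketch. It is only a sketch under the assumption that the start-time is moved linearly along the maneuver; the function name and the two example durations are hypothetical, and the study itself was carried out in Matlab.

```python
def delayed_start(start_s: float, end_s: float, alpha_percent: float) -> float:
    """Move the labeled start of a lane-change alpha_percent closer to its end."""
    if not 0 <= alpha_percent < 100:
        raise ValueError("alpha_percent must be in [0, 100)")
    if end_s <= start_s:
        raise ValueError("end_s must be later than start_s")
    return start_s + (alpha_percent / 100.0) * (end_s - start_s)


# Two hypothetical maneuvers, both starting at t = 0 s, lasting 4 s and 8 s.
for alpha in (20, 40, 60, 80):
    short = delayed_start(0.0, 4.0, alpha)
    long_ = delayed_start(0.0, 8.0, alpha)
    print(f"alpha = {alpha}%: start moved to {short:.1f} s (4 s maneuver), "
          f"{long_:.1f} s (8 s maneuver)")
```

The same α shifts the start-time twice as far in seconds for the longer maneuver, which is the sense in which the delay affects start-times differently.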

The results from the training set in Table 12 show that the training accuracy increases with increasing evidence. In other words, the classification accuracy increases as the difference between lane-keep samples and lane-change samples increases.

Table 12: Training accuracy [%] after the gridsearch with delayed start-times for sets A.3, A.4 and A.5.

             Windowsize 0.5 s        Windowsize 0.75 s
Delay [%]    A.3     A.4     A.5     A.3     A.4     A.5
20           66.2    76.8    75.9    67.6    76.4    76.9
40           69.3    81.9    82.4    72.2    83.3    83.6
60           78.3    88.4    87.8    79.2    90.3    90.0
80           90.3    94.0    94.0    92.1    92.5    92.5

After the gridsearch these new classifiers were, as previously, trained using the whole training data set before being tested using data with unknown targets. The performance is presented in Figure 24.

[ROC plot: true positive rate against false positive rate; legend: max accuracy, max precision, min FPR, max TPR.]

Figure 24: Performance of classifiers in Table 12 with delayed start-times for lane-changes.

Figure 24 shows that the FPR has been reduced and that the TPR is in general higher for the selected classifiers. The lowest FPR was achieved with set A.4 (α = 80, γ = 0.5) with 8.0% at 84.0% TPR. The highest TPR was obtained with set A.4 (α = 60, γ = 0.75) with 92.6% at 13.5% FPR. The highest accuracy was achieved with set A.4 (α = 80, γ = 0.75) with 91.1% at 8.0% FPR and 82.1% TPR. The highest precision was obtained with set A.3 (α = 40, γ = 0.75) at 11.4% FPR and 68.9% TPR.

These results show that lowering the false positive rate greatly reduces the possibility of quick detections. However, even with large α-values the FPR was still 8% or higher, a rate that would cause problems if implemented in real vehicles.
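For reference, the four measures used to compare classifiers in this section follow from the counts of a two-class confusion matrix with lane-change as the positive class. The Python sketch below uses the standard definitions; the function name and the counts are hypothetical and are not data from the study.

```python
def roc_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard two-class measures with lane-change as the positive class."""
    if tp + fn == 0 or fp + tn == 0 or tp + fp == 0:
        raise ValueError("need positive, negative and predicted-positive samples")
    return {
        "tpr": tp / (tp + fn),                        # detected lane-changes
        "fpr": fp / (fp + tn),                        # lane-keeps flagged as lane-change
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # all correct classifications
        "precision": tp / (tp + fp),                  # flagged samples that were correct
    }


# Hypothetical counts: 100 lane-change and 100 lane-keep samples.
metrics = roc_metrics(tp=84, fp=8, tn=92, fn=16)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Accuracy and precision depend on the balance between lane-keep and lane-change samples, so the same TPR and FPR can give different accuracies on differently composed test sets.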

4.1.3 Impact of including turn signals for detecting lateral actions

Turn signals have been excluded up until now when classifying lateral actions, partly because they have been viewed as a part of lateral intent and partly because it was of interest to see the performance of the algorithm without them.

To test the effect of introducing turn signals into the solution, the sets A.3, A.4 and A.5 were extended to include them. These extended sensor sets were then tested with three windowsizes: 0.5, 0.75 and 1.0 seconds. Using the same procedure as previously, the combinations were trained using a gridsearch to find optimal parameters (see Table 13 for results), followed by a training session using the whole set of training data before testing the classifiers with a set of test data. The results are presented in Figure 25.

Table 13: Training accuracy [%] after gridsearch for sets A.3, A.4 and A.5 now including turn signals.

Windowsize [s]    A.3     A.4     A.5
0.5               80.1    83.0    83.0
0.75              84.7    83.1    83.5
1.0               79.9    80.1    80.1

[ROC plot: true positive rate against false positive rate.]

Figure 25: Performance of sets A.3, A.4 and A.5 now including turn signals.

Introducing turn signals into the sensor sets improved the performance, as could be expected. Set A.3 for γ = 0.5 had both the lowest FPR with 11.0% and the highest TPR with 97.2%. A.3 also had the highest accuracy and precision with 93.1% and 90.0%. On average, the performance of the classifiers improved, with lowered FPR and increased TPR. The reason why 100% accuracy was not obtained is that the usage of turn signals was not synchronized with the lane-change maneuver. Sometimes they were activated prior to the start of the maneuver, sometimes after. There were also situations when they were not activated at all or were disabled before the completion of the lane-change.

A similar improvement is achieved for sensor sets A.1 and A.2 extended with turn signals. In Table 14 it can be seen that the training accuracy has increased about as much as for sets A.3, A.4 and A.5. The performance using test data is presented in Figure 26.

Table 14: Training accuracy [%] after gridsearch for sets A.1 and A.2 extended with turn signals.

Windowsize [s]    A.1     A.2
2.5               70.4    68.5
3.0               69.3    66.3
3.5               68.7    64.6

[ROC plot: true positive rate against false positive rate.]

Figure 26: Performance of sets A.1 and A.2 extended with turn signals.

For sets A.1 and A.2 now including turn signals, the lowest FPR was achieved for set A.1 (γ = 2.5) with 18.6% at 75.6% TPR. The highest TPR was achieved with set A.2 (γ = 2.5) with 76.1% at 26.9% FPR. The highest accuracy and highest precision were obtained for set A.1 (γ = 2.5) with 76.3% and 76.7% at 75.6% TPR and 18.6% FPR.

4.2 Classification of lateral intents

The approach to classify lateral intent is similar to how the results were obtained for lateral actions. A gridsearch with four-fold cross-validation is used to find the kernel parameter σ, cost parameter λ and windowsize γ. In addition to these there is a metaparameter ξ defining the time range before a lateral action that should be considered as the time for the intent. This parameter ξ was held fixed at 5 seconds for all computations. The training accuracy from the gridsearch using samples for lane-keep and lane-change intent is presented in Table 15.

Table 15: Training accuracy [%] for sets B.1 and B.2 for lateral intent.

Windowsize [s]    B.1     B.2
0.5               51.1    62.0
1.25              50.7    60.1
2.0               50.6    60.0

As expected, there are no specific patterns for set B.1 and no significant ones for B.2 prior to a lane-change. The ambiguity between lane-change and lane-keep data at this stage is too big. Extending the first set with turn signals to get set B.2 increased the training accuracy. The increase was still smaller than expected, since turn signals were used in 95.2% of the recorded lane-changes. The reason is that the turn signals were not activated as early as 5 seconds prior to a lane-change. Instead they were on average activated 1.1 seconds prior to the lateral actions resulting in the change.
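One way to see why the turn signals helped less than expected is to compare the average activation lead of 1.1 seconds with the intent window ξ = 5 seconds. The Python sketch below computes the share of the intent window during which the turn signal would be on. It is illustrative only: the function name is hypothetical, and it assumes that the signal stays on from its activation until the lateral action starts.

```python
def signal_coverage(xi_s: float, lead_s: float) -> float:
    """Share of the intent window [start - xi_s, start) with the signal on.

    Assumes the signal is switched on lead_s seconds before the lateral
    action starts and remains on until the action starts.
    """
    if xi_s <= 0:
        raise ValueError("xi_s must be positive")
    covered_s = max(0.0, min(lead_s, xi_s))
    return covered_s / xi_s


# xi = 5 s and the average lead of 1.1 s are the values stated in the text.
share = signal_coverage(xi_s=5.0, lead_s=1.1)
print(f"turn signal on during about {share:.0%} of the intent window")
```

Under this assumption the turn signal is on for only about a fifth of an average intent window, so most samples labeled as lane-change intent look the same as lane-keep samples with respect to this signal.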

The performance of these classifiers when tested against new sets of data verifies this behavior, see Figure 27. This further supports the necessity of using eye- or head-trackers for this task.

[ROC plot: true positive rate against false positive rate.]

Figure 27: Performance of set B.1 and B.2 for lateral intent.

For comparison with the results for lateral actions, set B.2 (γ = 0.5) had both the lowest FPR and the highest TPR, with 35.2% FPR at 61.7% TPR. Set B.2 (γ = 0.5) also had the highest precision and accuracy with 63.0% and 63.3%.

4.3 Classification of inferred environmental intents

Classification of inferred environmental intent uses the same approach as for the other tasks. A gridsearch with four-fold cross-validation is used to find the kernel parameter σ, cost parameter λ and windowsize γ. For this task only the last values of the window are of importance. Therefore an arbitrary windowsize can be chosen.

Classifying this task requires two one-versus-one classifications: left lane-change versus lane-keep and right lane-change versus lane-keep. The training accuracies for both these tasks are presented in Table 16.

Table 16: Training accuracy for sets C.1 and C.2 for inferred environmental intent.

Set Accuracy [%]

C.1 97.8

C.2 98.0

The high training accuracies for this task are due to its simplicity and non-ambiguous data. The dependency between the sensor values for this task could just as easily be set up using a rule-based solution, since it is more a matter of determining states than of complex dependencies and changing values.

Using these results will mostly help to reduce the FPR when detecting lateral actions. Even though this signal can clearly detect whether a driver should change lane or not, it cannot determine exactly when the driver will change lane. Therefore this information should be used as an aid to handle misclassifications when detecting lateral actions. A good example of this is shown in Figure 28. In addition to the classification output from detecting lateral actions, the figure shows the output from inferred environmental intent. The outputs from the latter use □ for predicting lane-keep and ◦ when predicting lane-change. It can be seen that several false positives can be removed when the inferred environmental intent does not predict a lane-change. It will not remove them all, but it clearly helps to improve overall performance.

[Plot: lane position [%] against time [s], 2512 to 2524 s, with the classification outputs.]

Figure 28: Combining inferred environmental intent and detecting lateral actions. SVM output for lateral actions is marked with × for lane-keep and • for lane-changes. Predicted lane-keep for inferred environmental intent is marked with □ and predicted lane-change with ◦.

This information will be most useful during lane-keep actions. The majority of motorway driving with heavy vehicles involves this task, and it is also here that the majority of all false positives will occur. Figure 29 shows a typical lane-keep action in a scenario without vehicles near the truck. Here the inferred environmental intent efficiently suppressed the misclassifications.

[Plot: lane position [%] against time [s], 923 to 927.5 s, with the classification outputs.]

Figure 29: Combining inferred environmental intent and detecting lateral actions. SVM output for lateral actions is marked with × for lane-keep and • for lane-changes. Predicted lane-keep for inferred environmental intent is marked with □ and predicted lane-change with ◦.

When classifying lateral actions using set A.3 (α = 0 and without turn signals) with a windowsize of 0.5 seconds, combined with the output from inferred environmental intent, the false positive rate decreased from 17.6% to 9.4%. These rates could be decreased further by adopting the other suggested approaches for reducing the FPR.
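The text does not spell out the exact rule used to combine the two outputs. A minimal Python sketch of one plausible rule is given below, under the assumption that a lane-change is reported only when both the lateral action classifier and the inferred environmental intent predict one (a logical AND). The function names and all example data are hypothetical.

```python
from typing import List, Sequence


def gate_with_environment(action_pred: Sequence[bool],
                          env_pred: Sequence[bool]) -> List[bool]:
    """Report a lane-change only when both outputs predict a lane-change."""
    if len(action_pred) != len(env_pred):
        raise ValueError("the two output sequences must have equal length")
    return [a and e for a, e in zip(action_pred, env_pred)]


def false_positive_rate(pred: Sequence[bool], truth: Sequence[bool]) -> float:
    """Share of true lane-keep samples that are classified as lane-change."""
    negatives = sum(1 for t in truth if not t)
    if negatives == 0:
        raise ValueError("no lane-keep samples in truth")
    false_pos = sum(1 for p, t in zip(pred, truth) if p and not t)
    return false_pos / negatives


# Hypothetical lane-keep segment of ten samples (True = lane-change).
truth = [False] * 10
action = [False, True, False, False, True, False, False, True, False, False]
env = [False, False, False, False, True, False, False, False, False, False]

gated = gate_with_environment(action, env)
print(f"FPR without gating: {false_positive_rate(action, truth):.0%}")
print(f"FPR with gating:    {false_positive_rate(gated, truth):.0%}")
```

A strict AND also removes true positives whenever the environmental intent misses a lane-change, so in practice the rule would have to be evaluated against the TPR as well.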

Appendix F contains four more examples of how inferred environmental intent helps to reduce the number of false positives.


5 Discussion

This chapter discusses the results and the difficulties encountered during the project.

5.1 Collection of data

When collecting data for this thesis it was initially intended to use data from the Sweden Michigan Naturalistic Field Operational Test (SeMiFOT) [43]. SeMiFOT was a project to collect, store and analyze data from both heavy and ordinary vehicles with professional and non-professional drivers. Scania participated in this project with two trucks and nine drivers. In addition to the standard set-up of sensors in Scania trucks, video recordings of the driver environment and the scenario in front of the vehicle were made. There was also an eye-tracker to record the drivers' visual patterns. Unfortunately only parts of this data were available at Scania, and these parts did not include all the sensors required for this thesis. With full access to this data, a statistically better-founded study could have been performed, given the larger number of naturalistic lane-changes. This database could also have been used to determine the possibilities of using eye-trackers for detecting preparatory actions prior to a lane-change, as an initial investigation for future work.

The data collected for this thesis by arranging trips was nowhere close to the amount of data collected in SeMiFOT. This should be taken into consideration when looking at these results. Another important difference is that, for this thesis, all data were collected without a connected trailer. This allowed for more naturalistic lane-changes, but the results may not be representative for situations when driving with a connected trailer.

The drivers for the field trips were told of the aim of this study. This should not affect the characteristics of sensor data for lateral actions, but it could affect results for lateral intent and inferred environmental intent. To reduce this risk, only lane-changes that had natural causes, e.g. a slow leading vehicle, were selected. This could be determined from the video recordings of the collected data.

The limited amount of data also implies that the results obtained in this study can only be regarded as an indicator of whether this approach is suitable for this type of problem. They cannot be used as a guarantee that the algorithm will always have an 81.6% accuracy when classifying between lane-keep and lane-change actions.

Even though the amount of data was limited, there were difficulties in processing it. The trips resulted in five large data logs that had to be analyzed together with video recordings to manually find all lane-change and lane-keep samples and define their start and end times. This left less time for developing the algorithm in order to better solve this problem.

5.2 Classification algorithm

The results presented in this study, in comparison with related research, confirm that it is not trivial to detect early stages of lane-changes or to predict them during motorway driving. The approach of using SVM for detecting patterns for lateral actions, lateral intents and inferred environmental intent shows potential, but there are some drawbacks. The algorithm can efficiently find good decision surfaces for a set of data given appropriate parameters. The problem is to find suitable parameters for the optimization. Even with the limited amount of data available, every computation of a new decision surface for a given set of parameters took approximately 30 seconds. With the large number of combinations, the time to test and confirm results was demanding. This indirectly meant that only one kernel and a relatively small range of parameter values were tested. Therefore it cannot be guaranteed that a global maximum was found during the optimization. To reduce the computational times, more dedicated software written in C could be used, but the compact test environment offered by Matlab was at the time considered superior. This environment made it easier to handle the data logs, tune the algorithm and present data using graphs.
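To give a feeling for why the gridsearch was demanding, the Python sketch below estimates the total time of a search from the number of parameter values and folds. The grid sizes are hypothetical (the actual ranges are not restated here), and the sketch assumes that each fold of the four-fold cross-validation costs roughly the 30 seconds mentioned above, which is likely an overestimate since a fold trains on only part of the data.

```python
def gridsearch_hours(n_sigma: int, n_lambda: int, n_window: int,
                     n_folds: int = 4, seconds_per_fit: float = 30.0) -> float:
    """Rough total time [h] for an exhaustive gridsearch with cross-validation."""
    if min(n_sigma, n_lambda, n_window, n_folds) < 1:
        raise ValueError("all counts must be at least 1")
    fits = n_sigma * n_lambda * n_window * n_folds  # one fit per fold and grid point
    return fits * seconds_per_fit / 3600.0


# Hypothetical grid: 10 values of sigma, 10 of lambda and 3 windowsizes.
hours = gridsearch_hours(n_sigma=10, n_lambda=10, n_window=3)
print(f"estimated gridsearch time: {hours:.1f} hours per sensor set")
```

Even this modest hypothetical grid takes about ten hours per sensor set, which illustrates why only one kernel and a small parameter range could be tested.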

The high rates of false positives when detecting lateral actions introduce problems when using a method that only outputs discrete values. Considering the results, it might have been better to use an approach yielding a probabilistic output, such as RVM.

A possible way to add a probabilistic output when using SVM could be to add an artificial neural network with a logistic activation function on top of the SVM output, see Figure 30. This neural network would be fed with, for example, the five latest SVM outputs, y_k, y_{k−1}, ..., y_{k−4}. By finding optimal weights, w, using model training, it is possible to determine an optimal threshold for when to alert cooperating ADAS. For this approach to be effective, several classifications must be made every second.

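[Diagram: the inputs y_k, y_{k−1}, ..., y_{k−4}, weighted by w1 to w5, are summed (∑) together with a bias and passed through a logsig function to give the output.]

Figure 30: Artificial neural network with one layer and logistic output.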
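The single-layer network in Figure 30 can be sketched in a few lines of Python. This is only an illustration of the idea, not an implementation from the study: the class name is hypothetical, the weights and bias below are arbitrary placeholders rather than trained values, and the SVM outputs are assumed to be +1 for lane-change and −1 for lane-keep.

```python
import math
from collections import deque
from typing import Sequence


def logsig(x: float) -> float:
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


class SvmOutputSmoother:
    """Logistic neuron fed with the latest SVM outputs, newest first."""

    def __init__(self, weights: Sequence[float], bias: float) -> None:
        self.weights = list(weights)
        self.bias = bias
        # Start from a history of lane-keep outputs (-1), an assumption.
        self.history = deque([-1.0] * len(self.weights),
                             maxlen=len(self.weights))

    def update(self, y_k: float) -> float:
        """Add the newest SVM output and return a value between 0 and 1."""
        self.history.appendleft(y_k)  # the oldest output is dropped
        total = self.bias + sum(w * y for w, y in zip(self.weights, self.history))
        return logsig(total)


# Placeholder weights and bias; real values would come from model training.
smoother = SvmOutputSmoother(weights=[1.0, 1.0, 1.0, 1.0, 1.0], bias=0.0)
for k in range(5):
    p = smoother.update(+1.0)  # five consecutive lane-change outputs
    print(f"after {k + 1} lane-change output(s): {p:.3f}")
```

A single isolated lane-change output gives a low value, while a run of consecutive outputs pushes the value towards 1, so a threshold on this output would act on accumulated evidence rather than on one classification.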

The time measurements in the results showed that, with a precomputed decision surface and a relatively old computer, multiple classifications could be made per second. However, it was not investigated whether there was enough computational power to implement this type of solution in the test vehicle. For a real implementation it would also be necessary to add substantial additional logic to the classification process. In this thesis an approach was suggested for how to handle discontinuous data, but there are other important events. For example, when the vehicle crosses a lane-marking, the lane-change should be considered completed and the following classifications should be briefly interrupted. This is because the movement towards the middle of the destination lane should not be misjudged as a new lane-change.
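The interruption after a lane crossing could, for example, be handled by a simple hold-off timer. The Python sketch below is one hypothetical way to do this; the class name and the hold-off duration are assumptions made for illustration and were not part of the study.

```python
class CrossingHoldOff:
    """Suppress lane-change outputs for a short time after a lane crossing."""

    def __init__(self, hold_off_s: float) -> None:
        if hold_off_s < 0:
            raise ValueError("hold_off_s must not be negative")
        self.hold_off_s = hold_off_s
        self.resume_at_s = float("-inf")  # no crossing seen yet

    def filter(self, t_s: float, predicted_lane_change: bool,
               crossed_marking: bool) -> bool:
        """Return the classifier output, or False while holding off."""
        if crossed_marking:
            # The lane-change is considered completed; restart the timer.
            self.resume_at_s = t_s + self.hold_off_s
        if t_s < self.resume_at_s:
            return False
        return predicted_lane_change


# Hypothetical hold-off of 1 s and hypothetical classifier outputs.
hold_off = CrossingHoldOff(hold_off_s=1.0)
samples = [(0.0, True, False), (0.1, True, True), (0.5, True, False),
           (1.2, True, False), (1.3, False, False)]
for t_s, pred, crossed in samples:
    print(t_s, hold_off.filter(t_s, pred, crossed))
```

A suitable hold-off duration would have to be chosen from recorded data, for instance from the typical time the truck needs to settle in the middle of the destination lane.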

5.3 Sensors in the test vehicle

The results from the testing showed that the set-up of sensors in the test vehicle is adequate for detecting lane-changes. However, there is still room for improvement; for example, better representations of sensor data could be used for the classification process. The sensor that proved to be most important for lateral actions was the LDW camera sensor and its signal for lane position. As related research has also shown, this signal displays the most significant patterns during lateral actions, as could be expected.

The test set-up of sensors was not adequate to detect driver preparatory actions, either directly or indirectly. This was stated prior to the implementation based on related research and was later confirmed by the results. To approach this task it is required to implement either an eye-tracker for tracking eye movements (the driver's gaze) or a head-tracker for detecting head yaw movements when the driver looks at the rear-view mirror. This is because the most significant behavior pattern that drivers exhibit prior to lane-changes is more frequent and longer gazes at the rear-view mirrors. Of the two systems, the eye-tracker is the preferable choice, partly because it reduces the risk of false positives that could occur for head-trackers when the driver looks out the window rather than in the rear-view mirror. The other reason is that implementing an eye-tracker will also aid in solving other driver-related problems. For example, in relation to detecting and predicting driver actions and intents, it is also necessary to determine whether the driver is distracted or drowsy. This is because maneuvers by a distracted or drowsy driver can be misinterpreted as active maneuvers using the approach suggested in this thesis.

The alternative to driver preparatory actions, inferred environmental intent, proved to be an interesting approach. It can be used both to reduce the number of false positives when detecting lateral actions and for the problem of detecting drowsy or distracted drivers: if the vehicle quickly moves towards the lane-marking when, according to the situation, it should not, this could imply a distracted driver.

The approach for inferred environmental intent in this thesis is not fully reliable, because all assumptions are made based on leading vehicles and scenarios. A better system would need to create a digital map of the surroundings, detecting all objects (in front, behind and to the sides) near the truck, see Figure 31.

Figure 31: Required areas to cover with sensors for improved inferred environmental intent.

To further improve the performance of inferred environmental intent, advanced map data should be used. This could be provided by ADAS Horizon systems, which show optimal paths, road inclinations, number of lanes, distance to exit ramps and much more [33].

A more advanced system for inferred environmental intent takes the driver further out of the equation, and the solution approaches ideas of autonomous vehicles. However, this might be an appropriate first step: to gradually implement more and more advanced artificial intelligence to support ADAS until, in the end, it can fully control the vehicle by itself.


6 Conclusions

The purpose of this study was to investigate how driver actions and intents manifest in sensor measurements, during and prior to lane-changes, and how they can be accurately detected using an algorithm. Previous work in this area and the results in this thesis both lead to the conclusion that detecting and predicting lane-changes is not a trivial task. The complexity comes from the small differences and complex dependencies in data for lane-keep and lane-change maneuvers. These reasons support the choice of using pattern recognition techniques to approach this problem.

Using a support vector machine for the classification and detection of lateral actions showed promising results. The conclusions from those are:

• The set-up of sensors in the test vehicle from Scania was adequate for detecting lateral actions, provided that adjustments were made to reduce the false positive rate.

• How the sensor data is represented and fed to the algorithm is very important. The representations should highlight properties with significant patterns.

• Lane position was the feature with the most significant patterns during lateral actions.

• The small differences in data for lane-keep and lane-change actions, together with the smooth transitions in data during lane-changes, are the main reason for the high rates of false positives.

• The suggested solution cannot be used to determine whether the driver was focused or distracted during the maneuver, since direct inputs such as steering wheel angle did not show enough predictability.

• A solution that only outputs two discrete values introduces problems when troubleshooting the solution. Methods with probabilistic outputs could therefore yield more cues to improve the design of the algorithm.

It was realized from related research and confirmed by the results that the set-up of sensors in the test vehicle was not adequate for predicting lane-changes. Related research has shown that it is necessary to install an eye-tracker (or the less preferable choice, a head-tracker) for detecting preparatory actions prior to a lane-change. Implementing an eye-tracker will also help in the quest to detect drowsy or distracted drivers.

The approach of determining what the driver should prefer to do in a given scenario yielded promising results and efficiently helped to reduce the number of false positives. Looking at the problem this way excludes the driver from the equation, but it yields important information. Also, in the case of dangerous or bad drivers it might be better to look at the surroundings rather than the driver.

Inferring driver actions and intents has a natural uncertainty to it, and no solution will be bulletproof. Solving this problem requires using and combining many cues from different sources: the driver, the vehicle and the environment.


7 Future work

For future work it is of interest to develop or use a better system for collecting data. It should be able to mark events of interest in the CAN data during the recordings, either manually or by backtracking after detecting a lane crossing. The system should also be fully synchronized with multiple video inputs for recording activity in the cabin and the context around the ego-vehicle. The system should be portable and expandable so that it can be installed in different vehicles and used for different purposes.

It is also necessary to investigate how the prediction of lane-changes can be further improved by using an eye-tracker (or head-tracker), a solution that according to related research should help to solve the problem of finding driver preparatory actions prior to a lane-change.

This thesis has also assumed that drivers are active and in control of the vehicle, an assumption that cannot be guaranteed when the system is implemented and tested in a real vehicle. Therefore it is required to combine this system with, or at least receive inputs from, a driver distraction system to verify that the recognized action or predicted intent is actually intended.

It is also of interest to further explore the possibilities of detecting more objects surrounding the ego-vehicle. In combination with more sophisticated map data, an advanced version of the solution for inferred environmental intent could be created, introducing more artificial intelligence. This would be a reasonable step towards more intelligent vehicles and an autonomous future.


References

[1] S. Lee, E. Olsen, and W. Wierwille, A Comprehensive Examination of Naturalistic Lane-changes. National Highway Traffic Safety Administration, 2004.

[2] I. Ajzen, “The theory of planned behavior,” Organizational Behavior and Human Decision Processes, vol. 50, pp. 179–211, 1991.

[3] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. Kluwer Academic Publishers Norwell, 1981.

[4] J. C. Bezdek, R. Krisnapuram, and N. R. Pal, Fuzzy models and algorithms for pattern recognition and image processing. Springer, first ed., 2005.

[5] S. Gagula-Palalic, Fuzzy clustering models and algorithms for pattern recognition. International University of Sarajevo, 2008.

[6] C. W. Therrien, Decision estimation and classification: an introduction to pattern recognition and related topics. John Wiley & Sons, Inc, 1989.

[7] S. Miyamoto, H. Ichihashi, and K. Honda, Algorithms for Fuzzy Clustering. Springer-Verlag Berlin Heidelberg, 2008.

[8] S. Theodoridis and K. Koutroumbas, Pattern Recognition. Academic Press, third ed., 2006.

[9] D. Salvucci, “Inferring driver intent: A case study in lane-change detection,” Proceedings for the Human Factors Ergonomics Society 48th Annual Meeting, 2004.

[10] S. Lefevre, J. Ibanez-Guzman, and C. Laugier, “Context-based estimation of driver intent at road intersections,” IEEE Symposium of Computational Intelligence in Vehicles and Transportation Systems, 2011.

[11] N. Oliver and A. Pentland, “Graphical models for driver behavior recognition in a smartcar,” Proceedings of the IEEE Intelligent Vehicles Symposium, 2000.

[12] A. Pentland and A. Liu, “Modeling and prediction of human behavior,” Neural Computation, vol. 11, pp. 229–242, 1999.

[13] A. Pentland and A. Liu, “Towards real-time recognition of driver intention,” IEEE Intelligent Transportation Systems Conference, 1997.

[14] N. Kuge, T. Yamamura, O. Shimoyama, and A. Liu, “A driver behavior recognition method based on a driver model framework,” Nissan Motor Co., LTD, MIT, 2000-01-0349, 2011.

[15] H. Berndt, J. Emmert, and K. Dietmayer, “Continuous driver intention recognition with hidden markov models,” Proceedings of the 11th International IEEE Conference on Intelligent Transportation Systems, 2008.

[16] D. Salvucci and H. Mandalia, “Using support vector machines for lane-change detection,” Unknown, vol. 1, 2004.

[17] H. Mandalia, “Pattern recognition techniques to infer driver intentions,” Technical report DU-CS-04-08, 2004.

[18] R. Mabuchi and K. Yamada, “Study on driver-intent estimation at yellow traffic signal by using driving simulator,” IEEE Intelligent Vehicles Symposium (IV), 2011.

[19] A. Doshi, B. T. Morris, and M. M. Trivedi, “Lane change intent prediction for driver assistance: On-road design and evaluation,” IEEE Intelligent Vehicles Symposium (IV), 2011.

[20] J. McCall, D. Wipf, M. Trivedi, and B. Rao, “Lane change intent analysis using robust operators and sparse bayesian learning,” Technical report, San Diego La Jolla, University of California, 2004.

[21] M. Henning, O. Georgeon, T. Wynn, and J. Krems, “Modelling driver behavior in order to infer the intention of changing lanes.” http://www.conference.noehumanist.org/articles/Proceedings-HUMANIST-S3.1.pdf.

[22] J. McCall and M. Trivedi, “Driver behavior and situation aware brake assistance for intelligent vehicles,” Department of Computer Science, National Taiwan University, Taipei 106, Taiwan, 2010.

[23] M. Pal, “Kernel methods in remote sensing: A review,” Proceeding of the Workshop on Application of advanced soft computing Techniques in Geo-spatial Data Analysis, pp. 211–227, 2008.

[24] C. Cortes and V. Vapnik, “Support vector networks,” Machine Learning, pp. 273–297, 1995.

[25] V. Blanz, V. Vapnik, and T. Vetter, “Comparison of view-based object recognition algorithms using realistic 3d models,” Artificial Neural Networks - ICANN’96, vol. 1112, pp. 251–256, 1995.

[26] C. J. Burges, “A tutorial on support vector machines for pattern recognition,” Data mining and knowledge discovery, vol. 2, pp. 121–167, 1998.

[27] M. E. Tipping, “The relevance vector machine,” Advances in Neural Information Processing Systems, vol. 12, 2000.

[28] D. Tzikas, L. Wei, A. Likas, Y. Yang, and N. Galatsanos, “A tutorial on relevance vector machines for regression and classification with applications,” Department of Electrical and Computer Engineering, Illinois Institute of Technology, Chicago, USA.

[29] P. Blunsom, “Hidden markov models.” http://digital.cs.usu.edu/~cyan/CS7960/hmm-tutorial.pdf, 2004.

[30] R. Dugad and U. Desai, “A tutorial on hidden markov models,” Technical Report No. SPANN-96.1, 1996.

[31] R. Sutton and A. Barto, Reinforcement learning: An introduction. MIT Press, Cambridge, MA, internet ed., 1998.

[32] L. Gunnarson, Korkortsboken. STR Service AB, seventh ed., 2003.

[33] C. Ress, D. Balzer, A. Bracht, S. Durekovic, and J. Lowenau, “Adasis protocol for advanced in-vehicle applications,” ADASIS Forum.

[34] L. Tijerina, W. R. Garrott, D. Stoltzfus, and E. Parmer, “Eye glance behavior of van and passenger car drivers during lane change decision phase,” Transportation Research Record, vol. 1937, pp. 37–43, 2005.


[35] D. Salvucci, A. Liu, and E. Boer, “Control and monitoring during lane changes,” Vision in Vehicles 9, vol. 9, 2001.

[36] S. Haufe, M. Treder, M. Gugler, M. Sagebaum, G. Curio, and B. Blankertz, “EEG potentials predict upcoming emergency brakings during simulated driving,” Journal of Neural Engineering, vol. 8, 2011.

[37] E. Tsang, D. Yeung, and P. Chan, “Fuzzy support vector machine for solving two-class problems,” Proceedings of the Second International Conference on Machine Learning and Cybernetics, Xi’an, vol. 1, 2003.

[38] W. Press, S. Teukolsky, W. Vetterling, and B. Flannery, Numerical recipes. Cambridge University Press, third ed., 2007.

[39] S. Gunn, “Support vector machines for matlab.” http://www.isis.ecs.soton.ac.uk/resources/svminfo/.

[40] C. W. Hsu, C. C. Chang, and C. J. Lin, “A practical guide to support vector classification,” Department of Computer Science, National Taiwan University, Taipei 106, Taiwan, 2010.

[41] P. Sherrod, “Dtreg, software for predictive modeling and forecasting.” http://www.dtreg.com/.

[42] T. Fawcett, “ROC graphs: Notes and practical considerations for researchers,” HP Laboratories, MS 1143, 1501 Page Mill Road, Palo Alto, CA 94304, 2004.

[43] Chalmers and SAFER, “Sweden michigan naturalistic field operational test (semifot).” http://www.chalmers.se/safer/EN/projects/traffic-safety-analysis/semifot.


A Support vector machine

The mathematical description of SVM in this appendix is intended to further explain its functionality and advantages. This description is based on texts by Cortes and Vapnik [24], Press et al. [38] and Burges [26]. The author suggests those texts for further reading and more information about SVM.

A.1 Linear support vector machine

Mathematically, a linear SVM can be described using a given set of training data $(\mathbf{x}_i, y_i)$ for $i = 1, \ldots, n$ with $y_i \in \{-1, 1\}$ and $\mathbf{x}_i \in \mathbb{R}^p$, where $y_i$ indicates the class that $\mathbf{x}_i$ belongs to. In other words, $y_i$ shows on which side of the decision rule the data point resides. For a given set of data points, there can be several hyperplanes that separate the two classes of data, see Figure 9 in Section 2.3.1. The goal is to find the hyperplane that maximizes the margin between the two classes. The margin is defined as the sum of the distances from the closest data points of both classes to the hyperplane. A larger margin is desired because it reduces the problem of overfitting. Overfitting occurs when the solution is too customized for the training data and not suitable for unknown test data.

An arbitrary hyperplane that completely separates the data points into two classes is defined by the equation

\[ f(\mathbf{x}) \equiv \mathbf{w} \cdot \mathbf{x} + b = 0 \tag{A.1} \]

where $\mathbf{w}$ is the normal to the hyperplane, $\mathbf{x}$ is a data point and $b$ sets the offset of the plane: $|b| / \|\mathbf{w}\|$ is the perpendicular distance from the origin to the plane. Given an initial hyperplane, with training points $y_i = 1$ on one side of the hyperplane and $y_i = -1$ on the other, the parameters $\mathbf{w}$ and $b$ can be scaled so that the following constraints hold,

\[ \mathbf{x}_i \cdot \mathbf{w} + b \geq +1 \quad \text{when } y_i = +1 \tag{A.2} \]

\[ \mathbf{x}_i \cdot \mathbf{w} + b \leq -1 \quad \text{when } y_i = -1 \tag{A.3} \]

In Figure 8 in Section 2.3.1 these constraints correspond to the two parallel bounding hyperplanes (dotted lines). All data points that lie on these hyperplanes are the support vectors, hence the name of the method [26]. The perpendicular distance between the two bounding hyperplanes is $2 / \|\mathbf{w}\|$. Equations A.2 and A.3 can be combined into one inequality,

\[ y_i(\mathbf{x}_i \cdot \mathbf{w} + b) - 1 \geq 0 \quad \forall i \tag{A.4} \]
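As an illustration of Equations A.1 to A.4, the following MATLAB sketch checks the combined constraint and evaluates the margin $2 / \|\mathbf{w}\|$ for a hand-picked hyperplane. The four data points and the values of `w` and `b` are made-up assumptions for this example; they are not taken from the recorded truck data.

```matlab
% Toy check of the constraint in Equation A.4 and of the margin 2/||w||.
% X, y, w and b are illustrative assumptions, not values from the study.
X = [2 2; 3 3; 0 0; -1 0];      % one data point x_i per row
y = [1; 1; -1; -1];             % class labels y_i
w = [0.5; 0.5];                 % normal vector of the hyperplane
b = -1;                         % offset of the hyperplane

c = y .* (X*w + b) - 1;         % left-hand side of Equation A.4
feasible = all(c >= 0);         % true if every point satisfies A.4
isSupport = abs(c) < 1e-12;     % points lying on the bounding hyperplanes
marginWidth = 2 / norm(w);      % distance between the bounding hyperplanes

fprintf('feasible = %d, margin = %.4f\n', feasible, marginWidth);
```

In this toy case the first and third points lie exactly on the bounding hyperplanes, so they would act as support vectors, while the other two points could be moved away from the boundary without changing the hyperplane.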

A.2 Quadratic programming

Quadratic programming is the mathematical optimization problem of maximizing or minimizing a quadratic function of several variables subject to linear constraints. The maximum margin for an SVM can be found by solving such a problem: since the perpendicular distance between the two bounding hyperplanes is $2 / \|\mathbf{w}\|$, maximizing the margin is equivalent to minimizing $\|\mathbf{w}\|^2$ [26].

A.2.1 Primal and dual problems in quadratic programming

For optimization problems with linear constraints it is often possible to convert the original (primal) problem to a dual problem. The dual problem is obtained by forming the Lagrangian, using multipliers to add the constraints on the variables to the objective function. The general primal problem can be stated as,

\[
\begin{aligned}
\text{Minimize:} \quad & f(\mathbf{w}) \\
\text{Subject to:} \quad & g_j(\mathbf{w}) \leq 0 \\
& h_k(\mathbf{w}) = 0
\end{aligned}
\tag{A.5}
\]

with $f(\mathbf{w})$ being quadratic in $\mathbf{w}$ and with the linear constraints $g_j(\mathbf{w})$ and $h_k(\mathbf{w})$. The indices $j$ and $k$ run over the sets of inequality and equality constraints. The Lagrangian is then formed by adding the constraints, weighted by the multipliers $\alpha_j$ and $\beta_k$, to the objective function as seen in Equation A.6 [38].

\[ L \equiv \tfrac{1}{2} f(\mathbf{w}) + \sum_j \alpha_j g_j(\mathbf{w}) + \sum_k \beta_k h_k(\mathbf{w}) \tag{A.6} \]

To find an extremum (minimum or maximum), the following conditions are used,

\[ \frac{\partial L}{\partial w_i} = 0, \qquad \frac{\partial L}{\partial \beta_k} = 0 \tag{A.7} \]

to algebraically eliminate $\mathbf{w}$ from $L$ in favor of $\alpha$ and $\beta$. By the strong duality and Kuhn-Tucker theorems, the dual problem is equivalent to the primal, but now,

\[
\begin{aligned}
\text{Maximize:} \quad & L(\alpha, \beta) \\
\text{Subject to:} \quad & \alpha_j \geq 0 \ \text{for all } j
\end{aligned}
\tag{A.8}
\]

This means that if $\mathbf{w}$ is the optimal solution of the primal problem, then $\alpha$ and $\beta$ are the optimal solutions of the dual problem. The Karush-Kuhn-Tucker complementarity condition $\alpha_j g_j(\mathbf{w}) = 0$ for all $j$ states that at least one of $\alpha_j$ and $g_j(\mathbf{w})$ must be zero for each $j$. From the solution of the dual problem it is therefore possible to identify the inequality constraints that are active in the primal problem: those with nonzero $\alpha_j$ [38].

A.2.2 Dual problem for SVM

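In Equation A.4 there are only inequality constraints. Thus a nonnegative Lagrange multiplier $\alpha_i$ is introduced for each of these constraints. Writing each constraint in the form $g_i(\mathbf{w}) = 1 - y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \leq 0$, it is multiplied by its multiplier and added to the objective function as in Equation A.6, yielding Equation A.9 [38].

\[ L_P = \tfrac{1}{2}\, \mathbf{w} \cdot \mathbf{w} + \sum_i \alpha_i \left[ 1 - y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \right] \tag{A.9} \]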

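For the primal problem, $L_P$ is minimized with respect to $\mathbf{w}$ and $b$, with the requirement that the derivatives of $L_P$ with respect to all $\alpha_i$ vanish, subject to $\alpha_i \geq 0$. This implies the following conditions for an extremum,

\[ 0 = \frac{\partial L_P}{\partial \mathbf{w}} = \mathbf{w} - \sum_i \alpha_i y_i \mathbf{x}_i \quad \Rightarrow \quad \mathbf{w} = \sum_i \alpha_i y_i \mathbf{x}_i \tag{A.10} \]

\[ 0 = \frac{\partial L_P}{\partial b} = -\sum_i \alpha_i y_i \quad \Rightarrow \quad \sum_i \alpha_i y_i = 0 \tag{A.11} \]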

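Since these are equality constraints in the dual formulation, they can be substituted into Equation A.9, giving the dual form.

\[
\begin{aligned}
L_D(\alpha) &= \sum_j \alpha_j - \tfrac{1}{2} \sum_{j,k} \alpha_j y_j (\mathbf{x}_j \cdot \mathbf{x}_k) y_k \alpha_k \\
&\equiv \mathbf{e} \cdot \alpha - \tfrac{1}{2}\, \alpha \cdot \mathrm{diag}(\mathbf{y}) \cdot G \cdot \mathrm{diag}(\mathbf{y}) \cdot \alpha
\end{aligned}
\tag{A.12}
\]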

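In Equation A.12, $\mathbf{e}$ denotes the vector whose components are all unity, $\mathrm{diag}(\mathbf{y})$ is the diagonal matrix formed from the labels $\mathbf{y}$, and $G$ is the Gram matrix with elements $G_{jk} = \mathbf{x}_j \cdot \mathbf{x}_k$. Maximizing $L_D$ is the same as minimizing $-L_D$, so the dual problem turns into

\[
\begin{aligned}
\text{Minimize:} \quad & \tfrac{1}{2}\, \alpha \cdot \mathrm{diag}(\mathbf{y}) \cdot G \cdot \mathrm{diag}(\mathbf{y}) \cdot \alpha - \mathbf{e} \cdot \alpha \\
\text{Subject to:} \quad & \alpha_i \geq 0 \ \text{for all } i \\
& \alpha \cdot \mathbf{y} = 0
\end{aligned}
\tag{A.13}
\]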

Equation A.10 shows how the optimal solution $\mathbf{w}$ of the primal problem can be found from the solution $\alpha$. With the Karush-Kuhn-Tucker relation,

\[ \alpha_i \left[ y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 \right] = 0 \tag{A.14} \]

$b$ can be found from any nonzero $\alpha_i$. This yields the final decision hyperplane $f(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b$ [38].
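For two training points, one in each class, the dual problem can be solved by hand, which gives a compact check of Equations A.10 to A.14. The constraint $\alpha \cdot \mathbf{y} = 0$ forces $\alpha_1 = \alpha_2 = a$, so that $L_D = 2a - \tfrac{1}{2} a^2 \|\mathbf{x}_1 - \mathbf{x}_2\|^2$, which has its maximum at $a = 2 / \|\mathbf{x}_1 - \mathbf{x}_2\|^2$. The MATLAB sketch below follows these steps for two assumed points; the points are chosen only for illustration.

```matlab
% Two-point SVM solved through the dual (Equations A.10 to A.14).
% The points x1 and x2 are illustrative assumptions.
x1 = [2; 2];  y1 = +1;
x2 = [0; 0];  y2 = -1;

% alpha . y = 0 gives alpha1 = alpha2 = a, and maximizing
% L_D = 2a - 0.5*a^2*||x1 - x2||^2 gives the value below.
a = 2 / norm(x1 - x2)^2;
alpha = [a; a];

% Equation A.10: w is a weighted sum of the training points.
w = alpha(1)*y1*x1 + alpha(2)*y2*x2;

% Equation A.14 with alpha1 > 0: y1*(w.x1 + b) = 1, so b = y1 - w.x1.
b = y1 - dot(w, x1);

f = @(x) dot(w, x) + b;         % decision function f(x) = w.x + b
fprintf('w = [%g %g], b = %g, f(x1) = %g, f(x2) = %g\n', w, b, f(x1), f(x2));
```

Both points end up exactly on the bounding hyperplanes, and the resulting margin $2 / \|\mathbf{w}\|$ equals the distance between the two points, as expected when only two points are available.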

A.2.3 Soft margin

If no hyperplane can split the positive and negative examples without errors, the soft margin method is used to find a hyperplane that splits the examples as well as possible. This method adds a slack variable $\xi_i$ for each data point that measures its misclassification. If the data point lies on the correct side of its bounding hyperplane then $\xi_i = 0$, otherwise $\xi_i > 0$. The error penalty is calculated by multiplying the distance to the hyperplane with the error cost. The modified inequality then becomes,

\[ y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \geq 1 - \xi_i, \qquad 1 \leq i \leq n \tag{A.15} \]

The optimization then becomes a trade-off between a large margin and making the $\xi_i$ as small as possible. This problem requires not only optimization but also regularization. Regularization introduces additional information to solve ill-posed problems or to prevent overfitting [38]. The objective is written $A + \lambda B$, where $A$ is the quadratic form and $B$ is the regularization term in $\xi$. The primal problem then becomes Equation A.16, with the cost parameter $\lambda$ varied in the range $0 < \lambda < \infty$.

\[
\begin{aligned}
\text{Minimize:} \quad & A + \lambda B = \tfrac{1}{2}\, \mathbf{w} \cdot \mathbf{w} + \lambda \sum_i \xi_i \\
\text{Subject to:} \quad & \xi_i \geq 0, \\
& y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \geq 1 - \xi_i, \qquad i = 1, \ldots, n
\end{aligned}
\tag{A.16}
\]

The cost parameter λ controls the trade-off between a large margin and few margin violations on the training data. A large margin is less accurate on training data but possibly more robust for new data. A small margin is accurate on training data but generalizes less well to new data [38]. If λ is close to zero it emphasizes large margins, i.e. less attention is paid to points violating the margin constraint. However, if λ is large there is a major penalty for points violating the margin constraint, potentially resulting in overfitting.
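The role of the slack variables and of the cost parameter can be made concrete with a small MATLAB sketch. For a fixed hyperplane, the smallest slack that satisfies the constraints of Equation A.16 is $\xi_i = \max(0,\, 1 - y_i(\mathbf{w} \cdot \mathbf{x}_i + b))$, and the objective is then evaluated for a few values of λ. The data, the hyperplane and the λ values are assumptions chosen for illustration only.

```matlab
% Slack variables and soft-margin objective (Equation A.16) for a fixed
% hyperplane. All numbers are illustrative assumptions.
X = [2 2; 3 3; 0 0; 1 1];       % the last point lies inside the margin
y = [1; 1; -1; -1];
w = [0.5; 0.5];
b = -1;

% Smallest slack that satisfies y_i*(w.x_i + b) >= 1 - xi_i and xi_i >= 0.
xi = max(0, 1 - y .* (X*w + b));

lambdas = [0.1 1 10];           % cost parameters to compare
objective = 0.5*dot(w, w) + lambdas * sum(xi);
disp([lambdas; objective]);
```

Only the last point has nonzero slack, so the margin term stays the same while the penalty term grows in proportion to λ. A solver given a large λ would therefore prefer a different hyperplane that reduces this slack, at the cost of a narrower margin.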


A.3 The kernel trick

With the implementation of a kernel $K$, a mapping of a $p$-dimensional feature vector into a higher-dimensional space becomes possible.

\[ \mathbf{x} \rightarrow \varphi(\mathbf{x}) \tag{A.17} \]

The advantage of this mapping is that non-linear decision surfaces in the original feature space can be represented by linear ones in the higher-dimensional space. A decision surface $f(\mathbf{x})$ constructed in the higher-dimensional space (embedded space) can then include, for example, all linear and quadratic forms of the original features,

\[ f(\mathbf{x}) = F[\varphi(\mathbf{x})] \equiv \mathbf{w} \cdot \varphi(\mathbf{x}) + b \tag{A.18} \]

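Finding the optimal $\mathbf{w}$ and $b$ in embedded space for the primal problem then becomes,

\[
\begin{aligned}
\text{Minimize:} \quad & \tfrac{1}{2}\, \mathbf{w} \cdot \mathbf{w} + \lambda \sum_i \Xi_i \\
\text{Subject to:} \quad & \Xi_i \geq 0 \\
& y_i(\mathbf{w} \cdot \varphi(\mathbf{x}_i) + b) \geq 1 - \Xi_i, \qquad i = 1, \ldots, n
\end{aligned}
\tag{A.19}
\]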

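If the mapping goes into a space of thousands of dimensions, solving the primal problem directly would be computationally expensive. However, in the dual form the problem becomes,

\[
\begin{aligned}
\text{Minimize:} \quad & \tfrac{1}{2}\, \alpha \cdot \mathrm{diag}(\mathbf{y}) \cdot K \cdot \mathrm{diag}(\mathbf{y}) \cdot \alpha - \mathbf{e} \cdot \alpha \\
\text{Subject to:} \quad & 0 \leq \alpha_i \leq \lambda \ \text{for all } i \\
& \alpha \cdot \mathbf{y} = 0
\end{aligned}
\tag{A.20}
\]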

This is the same problem as in Equation A.13, but with the kernel matrix $K$ in place of the Gram matrix $G_{ij}$ and with the upper bound $\lambda$ on each $\alpha_i$. The trick with kernels is that it is not necessary to know the mapping $\varphi(\mathbf{x})$ at all. All that is needed is a computation of $K_{ij}$ that could have come from a mapping $\varphi(\mathbf{x})$. In other words, $K$ is a matrix of size $n \times n$ with the mathematical properties of inner products in a higher-dimensional space [38]. One possible kernel is the Gram matrix $G_{ij}$ itself. A valid kernel must be symmetric and have nonnegative eigenvalues according to Mercer’s theorem. Sums, products and positive scalings of kernels are again kernels [38]. Once a kernel has been chosen, the decision rule for any new feature vector $\mathbf{x}$ is

\[ f(\mathbf{x}) = \sum_i \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}) + b \tag{A.21} \]

with $b$ given by

\[ b = \frac{\sum_i \alpha_i(\lambda - \alpha_i) \left[ y_i - \sum_j \alpha_j y_j K(\mathbf{x}_j, \mathbf{x}_i) \right]}{\sum_i \alpha_i(\lambda - \alpha_i)} \tag{A.22} \]
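The decision rule and the expression for $b$ translate almost directly into matrix operations. The MATLAB sketch below is an illustration under stated assumptions: the two training points, the multipliers `alpha` (the hand-derived solution for these two points with a linear kernel) and the value of `lambda` are made up, and the anonymous function `kernel` is a placeholder for whichever kernel is chosen.

```matlab
% Decision rule (Equation A.21) and offset b (Equation A.22) for given
% multipliers. X, y, alpha and lambda are illustrative assumptions.
X = [2 2; 0 0];                 % training points, one per row
y = [1; -1];
alpha = [0.25; 0.25];
lambda = 10;                    % cost parameter, upper bound on alpha
kernel = @(A, B) A * B';        % linear kernel, K(i,j) = a_i . b_j

K = kernel(X, X);               % n-by-n kernel matrix
wgt = alpha .* (lambda - alpha);             % weights in Equation A.22
b = sum(wgt .* (y - K*(alpha .* y))) / sum(wgt);

% Equation A.21 evaluated for new feature vectors (one per row).
f = @(Xnew) kernel(Xnew, X) * (alpha .* y) + b;
disp(f([3 3; -1 0]));
```

The weights $\alpha_i(\lambda - \alpha_i)$ vanish for points with $\alpha_i = 0$ or $\alpha_i = \lambda$, so only support vectors strictly inside the bounds contribute to the average that defines $b$.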

A.3.1 Selecting kernel

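Finding a kernel for a supervised classification problem is a difficult task. Often some of the simpler kernels are good enough. A few common examples are:

• Linear: $K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i \cdot \mathbf{x}_j$

• Power: $K(\mathbf{x}_i, \mathbf{x}_j) = (\mathbf{x}_i \cdot \mathbf{x}_j)^d$

• Polynomial: $K(\mathbf{x}_i, \mathbf{x}_j) = (a\, \mathbf{x}_i \cdot \mathbf{x}_j + b)^d$

• Sigmoid: $K(\mathbf{x}_i, \mathbf{x}_j) = \tanh(a\, \mathbf{x}_i \cdot \mathbf{x}_j + b)$

• Gaussian radial basis function: $K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left( -\dfrac{|\mathbf{x}_i - \mathbf{x}_j|^2}{2\sigma^2} \right)$

The Gaussian radial basis function kernel is popular because it has only one model parameter, σ, in addition to the cost parameter λ if required.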
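As a sketch of how such kernels can be evaluated for a whole data set at once, the MATLAB code below builds the kernel matrices for the linear, power and Gaussian radial basis function kernels and checks the Mercer properties (symmetry and nonnegative eigenvalues) numerically for the Gaussian case. The data matrix `X` and the parameter values `d` and `sigma` are assumptions made for this example.

```matlab
% Kernel matrices for some of the listed kernels and a numerical check of
% the Mercer properties (symmetry, nonnegative eigenvalues) for the RBF.
% The data X and the parameters d and sigma are illustrative assumptions.
X = [0 0; 1 0; 0 2; 3 1];       % one feature vector per row
d = 2;
sigma = 1.5;

G = X * X';                                   % linear kernel (Gram matrix)
Kpower = G .^ d;                              % power kernel
sq = sum(X.^2, 2);                            % squared norms |x_i|^2
D2 = bsxfun(@plus, sq, sq') - 2*G;            % squared distances |x_i - x_j|^2
Krbf = exp(-D2 / (2*sigma^2));                % Gaussian radial basis function

isSym = isequal(Krbf, Krbf');
minEig = min(eig((Krbf + Krbf')/2));
fprintf('symmetric = %d, smallest eigenvalue = %.4g\n', isSym, minEig);
```

The squared distances are computed from the Gram matrix instead of from explicit loops, which keeps the cost at a few matrix operations even for many training windows.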


B Test vehicle

Table 17: Specifications for the test vehicle.

Property                        Value
Tractor                         Scania R440LA4x2MNA
Wheel configuration             4x2
Engine torque                   2300 Nm (at 1000-1300 rpm)
Engine power                    440 hp
Gearbox                         Opticruise
Width                           243 cm
Wheelbase                       370 cm
Speed limit                     89 km/h
Environmental sensor systems    LDW, ACC

Figure 32: The test vehicle used for collecting data.


C Flowchart for inferred environmental intent

This flowchart shows the decisions the driver is required to make, based on environmental data, when driving on motorways. The example uses a motorway with three lanes, and lane position 1 denotes the inner lane. In the flowchart LK denotes lane-keep, LCR right lane-change and LCL left lane-change.

[Flowchart not reproduced; the node layout was lost in text extraction. Recoverable decision nodes: Type of road? (entrance ramp, motorway); Number of lanes?; Current lane?; Vehicle in front?; Speed of leading vehicle?; Rel. speed in right lane?; Rel. speed in left lane?; Left adjacent lane available?; Vehicle on entrance ramp?; Enter exit ramp?; Exit ramp for path < 1500 m?; Lane becoming exit ramp? Branch labels: Yes/No; lane 1, 2, 3 or >1; V >= Desired, V < Desired. Outcomes: LK, LCR, LCL.]

Figure 33: Flowchart suggesting appropriate action given a specific scenario.


D Matlab scripts

This appendix includes the two main scripts, one for the grid search and one for plotting ROC graphs. The scripts have been stripped down to show only the essentials and should be regarded as semi pseudo-code.

D.1 Main script for gridsearch

%% Grid search to find model and cost parameters for the SVM
% Author: Staffan Bengtsson
% Note: the helper functions and the variables trX_s, trY_s, prnt and
% resultsFile are not defined in this stripped-down listing.
clear all;
clc;
format shortG

global p1

% Variables for training data
block = 2;
shift = 0.1;
CRange = [1e-1 1e2 1e5 1e8 1e11 1e14 1e17];     % Cost parameters
p1Range = [1e-3 1e-2 1e-1 1e0 1e1 1e2 1e3];     % Model parameters
windows = [0.75 1.0 1.25 1.5 2.0 2.5 3.0 3.5];  % Window sizes
ker = 'rbf';                                    % Choice of kernel, radial basis function

fprintf('Grid search commenced..\n');
for g = 1:size(windows, 2)          % Loop through all window sizes
    windowSize = windows(g);        % Get window size

    % Get maneuver list for lane-keep and lane-change samples
    mL = getManeuverList();

    % Get the different sensor sets
    sL = getSensorSets();

    for v = 1:size(sL, 1)           % Loop through the different sensor sets
        features = sL{v, 2};        % Get all sensors from the set

        % Create vector for training data from maneuver list and sensor set
        [trX, trY] = createTrainingDataVector(mL, block, windowSize, shift, features);

        for i = 1:size(CRange, 2)           % Loop through C values
            C = CRange(1, i);               % Get C value
            for j = 1:size(p1Range, 2)      % Loop through p1 values
                p1 = p1Range(1, j);         % Get p1 value
                kAcc = [];                  % Accuracies for this (C, p1) pair
                for k = 1:4                 % Loop through cross-validation folds
                    % Select training and test data for cross-validation
                    [trX_k, trY_k, tsX_k, tsY_k] = crossValid(trX_s, trY_s, k);
                    % Create classifier from training data using function by Steve Gunn
                    [nsv, alpha, bias] = svc(trX_k, trY_k, ker, C, prnt);
                    % Cross-validate accuracy using function by Steve Gunn
                    h = svcinfo(trX_k, trY_k, tsX_k, tsY_k, ker, alpha, bias, prnt);
                    % Save accuracy for this cross-validation fold
                    kAcc = [kAcc h];
                end
                % Check and save if latest accuracies are a global high
                result = saveHighAccParam(kAcc, p1, C);
            end
        end
    end
    % Save results to file
    save(resultsFile, 'result');
end
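The helper `crossValid` is not shown in the listing above. As a purely hypothetical sketch of what a four-fold split of this kind could look like, the MATLAB code below assigns every sample to one of four folds and selects the training and held-out parts for each fold. The interleaved fold assignment and the toy data are assumptions made for this example and may differ from the function actually used in the study.

```matlab
% Hypothetical sketch of a 4-fold split such as crossValid might perform.
% The interleaved fold assignment and the toy data are assumptions.
trX = (1:10)' * [1 0.5];        % toy training matrix, 10 samples, 2 features
trY = [1 1 -1 -1 1 -1 1 -1 1 -1]';
nFolds = 4;

fold = mod((0:size(trX,1)-1)', nFolds) + 1;   % fold label 1..nFolds per sample
for k = 1:nFolds
    isTest = (fold == k);
    trX_k = trX(~isTest, :);  trY_k = trY(~isTest);   % training part
    tsX_k = trX(isTest, :);   tsY_k = trY(isTest);    % held-out part
    fprintf('fold %d: %d training, %d test samples\n', ...
            k, size(trX_k,1), size(tsX_k,1));
end
```

When the samples are overlapping time windows, an interleaved split places near-identical windows in both the training and the held-out part, so a split into contiguous blocks may give a more conservative accuracy estimate.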


D.2 Main script for ROC plot

%% ROC plot for classifier performance
% Author: Staffan Bengtsson
% Note: the helper functions and the variables shiftTr and SM are not
% defined in this stripped-down listing.
clear all;
clc;
format shortG

global p1

% Variables
block = 2;          % Number of blocks
shiftTs = 0.1;      % Shift size for each new classification
ker = 'rbf';        % Choice of kernel

% Get optimal parameters from the grid search
[windows, Cvalues, p1values] = getGridValues();

x = [];             % False positive rates, one per set and window
y = [];             % True positive rates, one per set and window

fprintf('Start plotting of ROC graph..\n');
for i = 1:size(windows, 2)      % Calculate ROC results for every set and window

    % Get optimal parameters for current set and window
    windowSize = windows(1, i);
    C = Cvalues(1, i);
    p1 = p1values(1, i);

    % Create vectors for training and test data
    [trX_trn, trY_trn] = createTrainDataVector(i, block, windowSize, shiftTr);
    [trX_tst, trY_tst] = createTestDataVector(i, block, windowSize, shiftTs);

    % Load svmConfig file to get Lagrange multipliers
    load('svmConfig.mat');
    nsv = svmConfig{i, 1};
    alpha = svmConfig{i, 2};
    bias = svmConfig{i, 3};

    % Make classifications for test data
    predictedY = svcoutput(trX_trn, trY_trn, trX_tst, ker, alpha, bias, SM);

    % Compare actual class with predicted classes to count TP, FP, FN and TN
    TP = 0; TN = 0; FP = 0; FN = 0;
    for p = 1:size(predictedY, 1)
        if (predictedY(p,1) == trY_tst(p,1)) && (trY_tst(p,1) == 1)
            TP = TP + 1;
        elseif (predictedY(p,1) == trY_tst(p,1)) && (trY_tst(p,1) == -1)
            TN = TN + 1;
        elseif (predictedY(p,1) ~= trY_tst(p,1)) && (trY_tst(p,1) == 1)
            FN = FN + 1;
        elseif (predictedY(p,1) ~= trY_tst(p,1)) && (trY_tst(p,1) == -1)
            FP = FP + 1;
        end
    end

    % Calculate false positive rate (x), true positive rate (y),
    % precision and accuracy
    x = [x FP/(FP+TN)];         % False positive rate
    y = [y TP/(TP+FN)];         % True positive rate
    precision = TP/(TP+FP);
    accuracy = (TP+TN)/(TP+FP+TN+FN);

    % Check if accuracy and precision are global high. Save position in vector
    [posA, posP, acc, prec] = saveHighestAccPrec(accuracy, precision);

end

% Plot ROC graph; inputs FPR, TPR, position for accuracy and precision
plotROCgraph(x, y, posA, posP);


E Plots: Classification of lateral actions

This appendix shows the output from classifiers for lateral actions using 100 ms steps. In these plots the lane position is plotted as a function of time. Outputs for lane-keep are marked with × and outputs for lane-change with •. The caption of each figure states the set and window size used.

[Plot not reproduced: lane position [%] versus time [s], 2705 s to 2707.5 s.]


Figure 34: Set A.3 with γ = 0.5. Time until detection: 0.3 s.

[Plot not reproduced: lane position [%] versus time [s], 1710 s to 1713.5 s.]


Figure 35: Set A.3 with γ = 0.5. Time until detection: 0.5 s.

[Plot not reproduced: lane position [%] versus time [s], 1025.8 s to 1027.8 s.]

Figure 36: Set A.3 with γ = 0.5. Problems with misclassifications for lane-keep actions.

[Plot not reproduced: lane position [%] versus time [s], 1141 s to 1144 s.]

Figure 37: Set A.3 with γ = 0.5. Problems with misclassifications for lane-keep actions.


F Plots: Classification of inferred environmental intent

This appendix shows the output from classifiers for both lateral actions and inferred environmental intent. In these plots the lane position is plotted as a function of time. For lateral actions, lane-keep outputs are marked with × and lane-change outputs with •. For inferred environmental intent, predicted lane-keep is marked with � while lane-change is marked with ◦.

[Plot not reproduced: lane position [%] versus time [s], 1141 s to 1144 s.]

Figure 38: Set A.3 with γ = 0.5. Inferred environmental intent helps to suppress seven misclassifications. 100 ms between each classification.

[Plot not reproduced: lane position [%] versus time [s], 2005 s to 2007.5 s.]

Figure 39: Set A.3 with γ = 0.5. Inferred environmental intent helps to suppress two misclassifications. 100 ms between each classification.

[Plot not reproduced: lane position [%] versus time [s], 1749 s to 1757 s.]

Figure 40: Set A.3 with γ = 0.5. Inferred environmental intent helps to suppress two misclassifications. 250 ms between each classification.

[Plot not reproduced: lane position [%] versus time [s], 1724 s to 1732 s.]

Figure 41: Set A.3 with γ = 0.5. Inferred environmental intent predicts the lane-change too early and fails to suppress two misclassifications. 250 ms between each classification.
