
Optimization and Management of Cyber-Physical Systems - Smart

Grid and Plug-in Hybrid Electric Vehicles

A Dissertation Presented

by

Bingnan Jiang

to

The Department of Electrical and Computer Engineering

in partial fulfillment of the requirements

for the degree of

Doctor of Philosophy

in

Computer Engineering

Northeastern University

Boston, Massachusetts

August 2015

To my family.


Contents

List of Figures

List of Tables

Acknowledgments

Abstract of the Dissertation

1 Introduction
  1.1 Contributions
  1.2 Thesis Organization

2 Optimal Energy Management in Smart Microgrid
  2.1 Background and Motivation
  2.2 System Overview
  2.3 Design of Distributed DR Systems
    2.3.1 DR Model
    2.3.2 Optimization problem formation
  2.4 Shared Cost-led µCHPs Management
    2.4.1 µCHP Model
    2.4.2 Shared Cost-led µCHPs Management Strategy
  2.5 VRB Discharging Management with Q-Learning
  2.6 Problem-solving Algorithms
  2.7 Bill Balancing Algorithm
  2.8 Simulation and Result
    2.8.1 Simulation Configuration
    2.8.2 Result Analysis
    2.8.3 Distributed DR Results and Analysis

3 Vehicle-to-Grid Reactive Power Compensation
  3.1 Background and Motivation
  3.2 V2G System Description
    3.2.1 System Overview
    3.2.2 Model of On-board Charger
  3.3 Multi-objective Optimization Formulation
    3.3.1 Optimizing PEV Agent Benefits
    3.3.2 Optimizing Utility Grid Reactive Power Compensation
    3.3.3 Multi-Objective Optimization Formulation
  3.4 Multi-Objective Optimization Solution Approach
    3.4.1 Problem linearization
    3.4.2 Normalized Normal Constraint Method
    3.4.3 Decentralized Algorithm Based on Lagrangian Decomposition
  3.5 Results and Analysis

4 On-road PHEV Power Management in Vehicular Networks
  4.1 Background and Motivation
  4.2 System Design
    4.2.1 Overview of the System
    4.2.2 System Models
  4.3 Hierarchical Power Management Algorithms and Solutions
    4.3.1 Hierarchical Power Management Algorithms
    4.3.2 Optimization Formulation and Solutions
  4.4 Results and Analysis

5 Traffic and Vehicle Speed Prediction in Vehicular Networks
  5.1 Background and Motivation
  5.2 System Description
  5.3 Vehicle Speed Prediction System Design
    5.3.1 Traffic Speed Prediction with NN
    5.3.2 Vehicle Speed Prediction with HMM
  5.4 Road Network and Simulation Setup
  5.5 Result and Analysis

6 Conclusion and Future Research

Bibliography


List of Figures

2.1 Energy ecosystem in smart microgrid
2.2 Scheme of the microgrid for a community
2.3 Microgrid management based on hierarchical optimization and bill balance
2.4 Model of Dynamic DR
2.5 µCHP electric energy flow among contributors and beneficiaries in the microgrid
2.6 Wind turbine power generation
2.7 Utility electricity price
2.8 Cost comparison between our energy ecosystem and the conventional system
2.9 Electricity consumption cost of one sample house in each day
2.10 Satisfaction degree of one house in each day
2.11 Electric and thermal load demand of the community in the evaluated day
2.12 Total µCHPs and heat pumps generation in the community in the evaluated day
2.13 Hot water tank temperature in the evaluated house
2.14 Energy consumption cost of the community with µCHP system generation
2.15 Energy consumption cost of the community with VRB discharging

3.1 Electrical and geographical map layers of the V2G reactive power compensation system
3.2 Charger operation mode
3.3 Solution approach diagram
3.4 Framework of the decentralized optimization with Lagrangian relaxation
3.5 Simulation case setup for a distribution feeder and locations of charging stations
3.6 Iterations of solving an anchor point in case 2
3.7 Pareto optimal points solved in cases 2, 3, and 6
3.8 Duality gaps of Pareto optimal points in cases 2, 3, and 6
3.9 Average scheduled PEV unit cost per convenience of Pareto points in 3 study cases
3.10 Total PEV drop penalty of Pareto points in 3 study cases
3.11 Average power loss ratios of three charging schemes in the 7 test cases

4.1 Scheme of on-road PHEV power management system
4.2 A PHEV model with PSD
4.3 Unit cycle models for urban roads and freeway
4.4 PHEV hierarchical mode for PHEV power management
4.5 Diagram of stochastic programming for online PHEV power management
4.6 UDDS driving cycle and a sample of generated stochastic driving cycle in spatial domain
4.7 Speed transition probabilities for urban roads with light traffic
4.8 MDP policy maps for an urban unit cycle with length index i_l = 2, stage index k = 15, light traffic, and remaining battery budget 0.015 kWh
4.9 Expected fuel consumption of an urban unit cycle (length index i_l = 2) with MDP and CDCS
4.10 Required torque on the final drive shaft
4.11 ICE and EM torque output and fuel consumption along distance in MSQP/MDP and CDCS
4.12 Battery SOC along driving distance in a sample test driving cycle
4.13 Operation points on ICE efficiency map
4.14 Fuel consumption comparison

5.1 Scheme of the 2-level vehicle speed prediction system
5.2 Diagram of a NN neuron
5.3 NARX NN Model for Traffic Speed Prediction
5.4 Left-to-right HMM for vehicle speed prediction
5.5 Luxembourg road network in SUMO
5.6 Procedure of data preparation for traffic prediction based on simulation
5.7 Road set for prediction in Luxembourg motorway network
5.8 Traffic prediction result for road segment #7 with one prediction period ahead
5.9 Traffic speed prediction RMSE of all road segments
5.10 AIC and BIC values for HMMs with different (Q, M) configurations
5.11 Comparison between HMM sampling and simulation observation for one road segment
5.12 Vehicle speed prediction RMSE of TSAP, NN/KDE and NN/HMM (∆k = 1)
5.13 Vehicle speed prediction MAPE of NN, NN/KDE and NN/HMM (∆k = 1)
5.14 Vehicle speed prediction RMSE of NN/KDE and NN/HMM with different ∆k
5.15 Histogram and pdf of vehicle speed prediction absolute error for road segment #5


List of Tables

3.1 Parking interval and station capacity configurations for different cases

4.1 Configuration of PHEV Powertrain with PSD


Acknowledgments

First and foremost, I would like to thank my advisor, Prof. Yunsi Fei, for her support of my research. Her excellent insights, guidance, and advice have helped me strengthen my creativity, train my critical thinking, and stay on the right track throughout my PhD study. I have also learned a lot from her about paper writing and presentation. I would also like to thank my committee members, Prof. Waleed Meleis, Prof. Edmund Yeh, and Prof. Ningfang Mi, for their great advice on my research proposal and dissertation. I am also thankful to Prof. Chee-Wooi Ten from Michigan Tech University for his valuable advice on my research.

I would particularly like to thank my family for their endless love and support. They have always given me the confidence and courage to face difficulties on this long journey. I could not have gone through it without their support.

I thank my lab mates for their help with my research and in my daily life. Working with you has been enjoyable and will remain a good memory. I also want to thank all my friends, who bring fun to my life and take away my weariness.


Abstract of the Dissertation

Optimization and Management of Cyber-Physical Systems - Smart Grid

and Plug-in Hybrid Electric Vehicles

by

Bingnan Jiang

Doctor of Philosophy in Computer Engineering

Northeastern University, August 2015

Dr. Yunsi Fei, Adviser

In cyber-physical systems (CPS), the bi-directional link between computational and physical elements can significantly increase efficiency, reliability, and cost-effectiveness. A precursor generation of CPS can be found in diverse applications, of which the smart grid and plug-in hybrid electric vehicles (PHEVs) are two exemplary, vibrant ones. Compared with traditional power distribution systems and gasoline-fueled vehicles, the smart grid and PHEVs offer much lower cost and higher service provision, and make the environment greener. However, it is challenging to manage the operations of CPS optimally, in view of system complexity, interaction between cyber and physical components and the environment, limited computation resources, and high real-time performance requirements.

My dissertation focuses on optimization and prediction model design for cost-effective and energy-efficient CPS, specifically the smart grid and PHEVs. First, a novel cost-effective energy ecosystem is proposed for a residential microgrid with renewable energy resources. It effectively coordinates demand response (DR), distributed generations (DGs), and energy storage management through a three-level hierarchical optimization, in which a particle swarm optimization (PSO) algorithm and an environment-adaptive Q-learning algorithm are applied. Second, I explore the application of modern vehicle-to-grid (V2G) technologies to smart grid reactive power compensation. On-board chargers of plug-in electric vehicles (PEVs) are proposed to be utilized as mobile volt-ampere reactive (VAR) resources. Third, an on-road PHEV power management system is proposed which utilizes the information of stochastic vehicle driving states and real-time traffic conditions. With these stochastic elements incorporated, a two-level hierarchical optimization model is developed based on multi-stage stochastic quadratic programming (MSQP) and Markov decision process (MDP). The proposed system makes optimal on-road power management decisions, and simulation results demonstrate that it outperforms existing methods in terms of fuel saving. Finally, a novel vehicle speed prediction algorithm is proposed in the context of vehicular networks. Vehicle speed prediction serves as an important input to many vehicle applications, e.g., power management. A novel data-driven vehicle speed prediction framework is proposed that integrates neural network (NN) and hidden Markov models (HMMs). The proposed method improves prediction accuracy compared with existing ones.


Chapter 1

Introduction

As more electricity-consuming products come into daily lives, e.g., electric vehicles (EVs)

and advanced HVAC systems, load demand is increasing dramatically and imposing new challenges

on the existing power grid. The smart grid, integrated with renewable energy generation, advanced metering

infrastructure, and information technologies, can help cope with the impending global energy crisis and

environmental deterioration. With great technological advances, rapidly developing plug-in electric

vehicles (PEVs) and plug-in hybrid electric vehicles (PHEVs) are taking the place of traditional

gasoline-fueled vehicles for both cost and emission reduction. Both smart grid and PHEVs are exemplary

cyber-physical systems (CPS), where the close interaction between cyber and physical elements can

significantly improve system efficiency, reliability, and cost-effectiveness. However, managing and optimizing

these CPS so as to take full advantage of them is challenging, due to system complexity,

dynamics, environmental uncertainty, limited resources, and high real-time performance requirements.

The smart grid features renewable energy and distributed generation (DG), from which

cheaper and cleaner energy is supplied to users. However, the expensive infrastructure investment,

such as the cost of wind turbines and high-capacity batteries, is one of the major obstacles preventing

their popularization in ordinary households. One solution is to build a microgrid where energy

facilities are shared by the whole community with significant infrastructure cost reduction for each

household. In a microgrid, the load demand can be scheduled by demand response (DR) to increase

energy utilization efficiency [1]. With the price profile known, some load is shifted to off-peak hours

to reduce the energy consumption cost. Besides, distributed energy resources (DER), including

DG and energy storage system, should be optimally managed to minimize the energy generation

cost. Most existing works optimize DR and DER management separately, since it is challenging

to integrate them in optimization models on account of system complexities, i.e., different roles,


decisions, and large number of control variables. Extra cost reduction can be achieved if DR and

DER management is well coordinated. Another difficulty is to integrate stochastic elements, e.g.,

stochastic wind power and load demand, into optimization models, since their accurate mathematical

models are usually hard to build. To coordinate DR and DER efficiently in the energy ecosystem, a

new hierarchical optimization model in a multi-agent system is designed in this dissertation.
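
As a rough illustration of the load-shifting idea described above, the following Python sketch picks the cheapest start hour for one schedulable task under a known price profile. It is not the DR model of this dissertation, which is formulated in Chapter 2. The function name `cheapest_start`, the price values, and the task window are assumptions made only for this example.

```python
def cheapest_start(prices, duration, earliest, latest_end):
    """Return (start_hour, price_sum) for the cheapest contiguous window.

    prices      -- hourly prices in $/kWh (assumed known day-ahead)
    duration    -- task length in whole hours
    earliest    -- first hour at which the task may start
    latest_end  -- hour by which the task must have finished (exclusive)
    """
    best_start, best_cost = None, float("inf")
    for start in range(earliest, latest_end - duration + 1):
        cost = sum(prices[start:start + duration])
        if cost < best_cost:  # strict "<" keeps the earliest of equally cheap windows
            best_start, best_cost = start, cost
    return best_start, best_cost


# Hypothetical 24-hour price profile (illustrative values, not from the dissertation):
# cheap night hours, a mid-priced day, an evening peak, and a late off-peak period.
prices = [0.08] * 7 + [0.15] * 10 + [0.25] * 4 + [0.10] * 3

# A 2-hour task that may start at hour 8 or later and must finish by hour 24.
start, cost = cheapest_start(prices, duration=2, earliest=8, latest_end=24)
```

With this profile the task is moved out of the evening peak into the late off-peak hours. The returned `cost` is the price sum for a constant 1 kW load; a real DR formulation would also weigh user satisfaction and local generation.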

In addition to real power management, reactive power compensation is also a major concern

for energy-efficient and reliable smart grid, especially with the increasing load demand and DG

penetration. PEVs can be plugged into dedicated sockets for charging [2] and provide auxiliary

vehicle-to-grid (V2G) support simultaneously through their on-board bidirectional chargers [3, 4].

Reactive power compensation aims at power loss reduction, voltage regulation, power factor

correction, etc. [5, 6]. Reactive power compensation is traditionally provided by distributed

generation (DG) units [5] and by static volt-ampere reactive (VAR) compensators [7]. However, these

methods are limited by the fixed capacities and locations of reactive power resources. Recent research

results show that reactive power compensation from PEV on-board chargers does not affect

battery lifetime [8]. In this dissertation, PEV on-board chargers are utilized as mobile VAR

resources to enhance the reactive power compensation for smart grid.
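
To make the notion of a charger acting as a VAR resource more concrete, the sketch below uses the standard apparent-power relation S^2 = P^2 + Q^2: whatever part of the charger's rating is not used for active charging power P remains available for reactive power Q. This is a generic textbook relation assumed here for illustration, not the on-board charger model presented in Chapter 3. The function names and the 5 kVA rating are hypothetical.

```python
import math


def available_reactive_power(s_rated_kva, p_charge_kw):
    """Largest |Q| (kVAR) a charger can supply while drawing p_charge_kw.

    Assumes the only limit is the apparent power rating: P^2 + Q^2 <= S^2.
    """
    if abs(p_charge_kw) > s_rated_kva:
        raise ValueError("active power exceeds the charger rating")
    return math.sqrt(s_rated_kva ** 2 - p_charge_kw ** 2)


def power_factor(p_kw, q_kvar):
    """Power factor P / sqrt(P^2 + Q^2); defined as 1.0 when no power flows."""
    s = math.hypot(p_kw, q_kvar)
    return 1.0 if s == 0 else abs(p_kw) / s


# Hypothetical 5 kVA charger that is charging at 3 kW.
q_max = available_reactive_power(5.0, 3.0)
```

Under this assumption, a charger that is parked but not charging (P = 0) can offer its full rating as reactive power, which is what makes scheduling parking as well as charging relevant.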

The PHEV power management system is another CPS studied in my dissertation. A PHEV’s

powertrain is usually designed in a series mode, parallel mode, or series-parallel mode with power-

split devices (PSDs). In series PHEVs, torques from internal combustion engines (ICEs) are applied

to generators to generate electricity which is then supplied to electric motors (EMs) to generate

traction torques and drive vehicles. For parallel PHEVs, traction torques are generated by both ICEs

and EMs for long-distance driving. Many modern PHEVs, such as Toyota Prius PHEV, are designed

with PSDs to further increase energy efficiency. A PSD introduces an extra degree of control freedom for the

powertrain, i.e., ICE speed, so power decisions can be made more flexibly and optimally according

to specific driving states. With different operation costs and efficiency characteristics, ICE and EM

are usually controlled together to achieve minimum fuel, electricity, or hybrid energy consumption.

Existing PHEV power management methods can be categorized into offline and online. Offline

management is usually formulated as an optimization problem based on historical driving cycles with

the assumption that future driving routes are known. This is easy for problem formulation but the

possible large difference between assumed future driving cycles and real ones will significantly affect

the management performance. In contrast, online management makes power generation decisions

in real time, adapting to instantaneous driving states. Constrained by the limited onboard

computation resources, online management algorithms are usually designed with low complexity,


like power balancing strategies, without utilizing trip information. Thus, online decisions are usually

not optimal for the entire trip. To take advantage of both online and offline management, a hybrid

power management system is designed to improve the system performance. It utilizes not only

historical driving cycles, but also real-time driving states, trip information, and traffic conditions.

Vehicle speed prediction serves as an important input for many vehicle-specific applications

such as PHEV power management. Accurate vehicle speed prediction is challenging and needs to

incorporate many internal and external elements into prediction models, such as vehicle type,

road type, and driving conditions. Traditionally, traffic and vehicle speed data are collected by

loop detectors and dedicated on-board equipment. Such data collection equipment can hardly be

deployed densely in a road network for vehicle speed prediction due to its high cost. In the context

of vehicular networks, more traffic and driving data can be obtained from additional sources and

easily shared between vehicles and remote data centers. Thus, facilitated by the new infrastructures

and enriched data, new data-driven algorithms can be designed to improve the accuracy of vehicle

speed prediction.

1.1 Contributions

This dissertation focuses on the optimal management and prediction system design for

cost-effective, energy-efficient, and reliable smart grid and PHEV CPS. The main contributions of

this dissertation are as follows:

• A novel cost-effective energy ecosystem in smart microgrid is proposed with a three-level

hierarchical optimization. The hierarchical optimization coordinates DR and DER management

and reduces the computational complexity. Interaction between users and DR is enhanced

by adopting users’ feedback. DR agents make decisions that adapt to changes in users’

preferences. Instead of optimizing each individual µCHP generation, all µCHPs are optimized

cooperatively for the whole community in a shared cost-led mode. An environment-adaptive

battery discharging management algorithm is designed based on Q-learning. It considers

stochastic elements in the microgrid and gives an optimal discharging policy.

• A V2G system is proposed to compensate reactive power for the grid during PEVs’ parking

and charging. The novelty of the proposed system lies in the utilization of PEV on-board

chargers as flexible and distributed VAR resources. PEV charging and parking are scheduled to

maximize benefits of both PEV owners and the utility grid. The scheduling is then formulated as

a multi-objective mixed integer nonlinear programming problem. The Normalized Normal Constraint

(NNC) method [9] is used to transform the multi-objective optimization to a set of single-

objective optimizations, each of which is solved to obtain a Pareto optimal solution (Pareto

point). Since the transformed single-objective optimization problems are nonlinear, they

are linearized into mixed integer linear programming (MILP) problems for efficient solving.

To make the solution approach scalable as the number of PEVs increases, a decentralized

algorithm is designed based on Lagrangian relaxation and decomposition.

• An on-road PHEV power management cyber-physical system is proposed in the context of

vehicular network. The objective of the proposed system is to minimize the fuel consumption

of a PHEV in a trip. The main contribution of this work is the design of a novel two-level

stochastic hierarchical power management system which utilizes vehicle real-time driving data,

vehicle speed prediction information, and historical driving cycles. The power management

consists of two steps, a high-level online battery budget allocation and a low-level offline

power policy generation. Decisions from the two-level optimizations are combined for PHEV

power management, which is optimal for individual driving trips.

• A vehicle speed prediction algorithm is proposed in vehicular networks. The objective is to

accurately predict individual vehicle speeds considering the effects of traffic conditions,

road types, and driving behaviors. Its main contribution is to build a statistical model that

captures the relationship between traffic conditions and vehicle speeds for on-road vehicle

speed prediction. The novelty of the prediction model lies in the consideration of unobservable

driving states. The first-level neural network (NN) models predict traffic speed of road segments

according to historical traffic data. The traffic speed prediction result will serve as input to

the second-level model. In the second level, the statistical relationship between the individual

vehicle speed and traffic speed is modeled by hidden Markov models (HMMs), which are

trained offline with historical traffic and vehicle speed data. The traffic speed prediction result

is then plugged into the HMMs to achieve vehicle speed prediction on each road segment along

the driving route.
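
The first contribution above relies on Q-learning for battery discharging management. As a reminder of what the learning step looks like, the sketch below shows the standard tabular Q-learning update with an epsilon-greedy action choice. It is a generic textbook form, not the environment-adaptive algorithm designed in Chapter 2. The state encoding, the two actions, the reward value, and the parameter values are all hypothetical.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    """
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def epsilon_greedy(q, state, actions, epsilon=0.1, rng=random):
    """Explore with probability epsilon, otherwise pick the highest-valued action."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical setup: a state is (hour, battery level) and the agent either
# holds or discharges. The reward stands in for the cost saved in that hour.
actions = ("hold", "discharge")
q = defaultdict(float)  # unseen state-action pairs start at 0.0

state, next_state = (18, "high"), (19, "medium")
q_update(q, state, "discharge", reward=1.0, next_state=next_state, actions=actions)
greedy_action = epsilon_greedy(q, state, actions, epsilon=0.0)
```

Because the update only needs observed transitions and rewards, no explicit model of the stochastic wind power or load demand is required, which is the property that makes this family of methods attractive for the setting described in the first contribution.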

1.2 Thesis Organization

The rest of the dissertation is organized as follows. Chapter 2 describes the design of a

smart microgrid energy ecosystem. Functions and relations of system roles are first described in


the overview. The modeling and implementation of DR and DER management, as well as the bill

balancing algorithm, are then presented with details. Finally, power management simulation results

are shown in figures and the performance improvement is analyzed. In Chapter 3, we consider the

design of a V2G reactive power compensation system based on optimal decentralized scheduling. The

system scheme and charger’s model are first described. A multi-objective optimization formulation for

PEV scheduling is then presented. Algorithms are also introduced for solving Pareto points, problem

linearization, and decentralized problem solving. Simulation is carried out based on a commercial

area in Boston and different case configurations are considered. From the results, benefits of both

PEV owners and utility grid are analyzed. Chapter 4 focuses on the design and implementation of

our proposed on-road PHEV power management system. The PHEV powertrain model and power

management objectives are first discussed. The design of two-level power management is shown

with the scheme diagram, formulation, and algorithm selection. The proposed power management

system is then tested on a Toyota Prius in simulation. Simulation results are analyzed and the fuel

consumption of the proposed system is compared with that of other existing systems. Our traffic and vehicle

speed prediction work is presented in Chapter 5. The scheme of two-level prediction system in

vehicular networks is described with a diagram. Details of each level prediction design are further

discussed. Simulation results are presented and the prediction accuracy is compared with other

methods. Chapter 6 concludes the whole dissertation.


Chapter 2

Optimal Energy Management in Smart Microgrid

2.1 Background and Motivation

Recent works have focused on the design and analysis of DR and DG management systems

for smart home and smart grid. Work [10] discusses challenges relating to load forecast and DR.

Work [11] shows that the unpredictable human factors can influence DR system’s performance

significantly. Existing DR systems are designed with deterministic or stochastic algorithms in

centralized or decentralized ways. A stochastic dynamic programming method for electricity usage

is proposed in [12]. It assumes that the transition probabilities of system states, e.g., utility power price

and outdoor temperature, are known. Work [13] proposes a residential DR algorithm

using Q-learning, which takes stochastic load demand, electricity price, and user’s convenience

into consideration. However, Q-learning can hardly be applied to complex tasks and price models

because convergence is slow when the state and action spaces are large. In work

[14], a decentralized load management control system is proposed based on real-time price. Overall,

most existing methods neglect the interaction between the DR system and the user, i.e., improving

the accuracy of decisions by observing the user’s manual adjustments. Therefore, these DR systems cannot

accommodate users’ preference changes and may give unsatisfactory decisions.

For DG management in microgrids, different optimization methods have been implemented.

Some work is based on static load and weather forecasts [15, 16] and does not consider their

dynamic and stochastic characteristics in real situations. Work [17] proposes a scheduling method for


hybrid supplies considering stochastic elements. The obtained decisions are optimal for the average

performance of all possible situations, but cannot adapt to an actual instance. In work [18], a robust

energy management for microgrid with intermittent renewable energy resources is proposed and the

worst-case transaction cost is included in the cost function. As a new type of clean DG with high

energy efficiency and low emission, micro combined heat and power systems (µCHPs) have recently

attracted much attention and become a promising DG in residential homes. µCHP control strategies

can be categorized as heat-led, electricity-led, and cost-led [19, 20, 21]. In a heat-led or electricity-led

strategy, the µCHP generates energy whenever there is a heat or electricity demand, respectively.

The cost-led strategies proposed in [20, 21] utilize the characteristic of co-generation to achieve

the minimum overall cost. Under this strategy, extra electricity is exported to the utility grid

and additional heat is absorbed by thermal energy storage, such as a hot water tank. Existing

strategies focus only on the optimal operation of a single µCHP. The issue of coordination among

multiple µCHPs for cost reduction in smart microgrid has not been addressed and will be explored in

our proposed energy ecosystem.
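
The three µCHP control strategies can be contrasted with a small decision rule. The sketch below is only a schematic reading of the definitions above: heat-led follows heat demand, electricity-led follows electricity demand, and cost-led compares overall cost. It is not the control logic of [19, 20, 21] or of the shared cost-led strategy in Section 2.4. The function name `chp_should_run` and the two cost arguments are hypothetical, and how those costs are estimated is deliberately left outside the example.

```python
def chp_should_run(strategy, heat_demand_kw, elec_demand_kw,
                   cost_if_on=None, cost_if_off=None):
    """Return True if a single µCHP unit should run in the current period.

    cost_if_on / cost_if_off are assumed, externally estimated overall costs
    (fuel, imported electricity, export revenue) with the unit on or off.
    """
    if strategy == "heat-led":
        return heat_demand_kw > 0
    if strategy == "electricity-led":
        return elec_demand_kw > 0
    if strategy == "cost-led":
        if cost_if_on is None or cost_if_off is None:
            raise ValueError("cost-led needs both cost estimates")
        return cost_if_on < cost_if_off
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical period with heat demand but no electricity demand.
heat_led = chp_should_run("heat-led", 2.0, 0.0)
elec_led = chp_should_run("electricity-led", 2.0, 0.0)
cost_led = chp_should_run("cost-led", 2.0, 0.0, cost_if_on=1.0, cost_if_off=1.5)
```

The rule is per unit and per period, which is exactly the limitation noted above: it says nothing about how several µCHPs in one community should share the load.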

2.2 System Overview

The ecosystem is described in Fig. 2.1 with six interacting components: utility grid, renewable

energy, DG, storage, appliances, and users. The behavior of one component affects that of the others.

DG generates energy locally by harvesting renewable energy resources or using fuel provided by

the utility company. The utility grid, DG, and energy storage provide energy to appliances, which in turn provide

services to users. Extra generated energy is stored in battery and thermal energy storage for future

use. The operation time of appliances is scheduled in DR for cost effectiveness. Finally, users enjoy

services and pay their bills. The scheme of the microgrid studied in this dissertation is shown in

Fig. 2.2. It works in the grid-connected mode and includes three types of flows: electric power flow,

thermal power flow and information flow. The electric power demand is supplied by the utility grid,

centralized wind turbines and batteries, and distributed µCHPs. The thermal power demand in each

individual house is supplied by µCHP or electric heat pump. Wind turbines are shared by the whole

community. Each household has its subscription rate indicating the amount of wind and battery

power that can be used. µCHPs generate power according to the load demand in the microgrid. In the

grid-connected microgrid, power generation and consumption are balanced. When load demand is

higher than generation, extra power will be supplied by utility grid. If extra electricity is generated, it

will be sold back to the utility grid. Current policies usually allow wind energy to be sold with retail

Figure 2.1: Energy ecosystem in smart microgrid

rates, while µCHP energy is sold at lower avoided-cost rates [22]. For general consideration, some houses

in the community are equipped with µCHPs, whereas others are not. For the latter, thermal energy

can only be provided by an electric heat pump. Extra thermal energy generated by a µCHP can be either

stored in the hot water tank or dumped. The temperature of the water tank should also be maintained

within a given range over time. Batteries belong to the community. They discharge in peak hours to reduce

cost. They also work as standby power supplies during unexpected blackouts. We select the vanadium

redox battery (VRB) rather than conventional deep-cycle lead-acid batteries because the VRB has

a much longer cycle life, higher efficiency, and lower discharging cost [23, 24, 25]. The information

flow contains utility power price, wind power prediction, users’ input, system status, control signals

from agents, etc.

The three-level hierarchical optimization is depicted in Fig. 2.3. Load demand and power

supply in the system are decoupled in this hierarchy. DGs can be categorized into two types:

uncontrollable ones determined by the external environment, such as wind turbines, and controllable ones

like µCHPs. The hierarchical optimization first realizes DR based on the wind power generation

Figure 2.2: Scheme of the microgrid for a community. Diagram labels: wind turbine, VRB, inverter, centralized agent, distributed agents (DA), micro CHP, water tank, heater, circuit breaker, smart meter; electric power flow, thermal power flow, information flow.

and then manages µCHPs’ generation and VRB discharging. The time resolutions at different

optimization levels are set the same for easy synchronization.

The lowest level is DR executed by the distributed agent (DA) of each house for energy

consumption cost reduction with users’ satisfaction taken into consideration. Distributed DRs can

significantly reduce computational complexity without losing much optimality. In each house, DA

collects relevant external information, e.g., day-ahead time-varying utility price and wind power

prediction, and realizes dynamic DR at every decision time, i.e., every hour. DR results include

the starting time of each schedulable task. The optimization formulation in DR considers power

supply from utility grid and wind turbine (subscribed wind power of each household), but not

µCHPs or VRB. This decoupling is reasonable since the cost and capability of µCHP generation do not

vary over time, and the VRB discharging capability does not change much unless the battery is depleted.

Thus, DR optimization results are not affected much when the power supply from µCHPs and VRB

is not considered. DR results are updated dynamically in each decision period with new external

information or newly added load.

At the second level, a centralized agent gets load demand of each house from DR decisions

and optimizes µCHPs’ generation. At one time, some houses may have high electric load demand

that cannot be supplied merely by their subscribed wind power and µCHP self generation, while

others with low demand do not need µCHP generation. Thus, this dissertation considers the potential

Figure 2.3: Microgrid management based on hierarchical optimization and bill balance. Diagram labels: dynamic DR (distributed agent); CHP management, VRB management, and bill balance (centralized agent); load demand, wind forecast, utility price, subscription rate (deterministic or stochastic input); load scheduling, micro CHP generation, VRB discharging, unsupplied load by wind, unsupplied load by wind and CHP, extra wind power, cost of each house, balanced bill.

improvement of energy generation efficiency by coordinating distributed µCHPs during optimization

and proposes the shared cost-led µCHP management strategy. In this strategy, instead of generating

power for its own house, generation of all µCHPs is coordinated to minimize the cost of the whole

community. The optimization agent first calculates the remaining load demand in the community

after deducting the predicted wind power supply. Since µCHPs generate electric and thermal power

simultaneously and have higher power output than battery, generation of µCHPs is first optimized

at this level to supply the remaining load. The optimization considers DR decisions from the first

level as well as utility power price and wind power prediction. VRB discharging is not considered in

optimization at this level.

The last-level optimization is for VRB charging and discharging. Unlike µCHP

power generation, the VRB can respond quickly to load changes by charging or discharging. Its discharging

is therefore optimized at the final stage to compensate for stochastic load and insufficient µCHP power generation in

the microgrid. The VRB is charged with extra wind power if it is not full and the

power selling price is low. Obtaining an optimal discharging policy is challenging because it depends on

the stochastic environment, i.e., load demand, utility price, and future available wind power. One



simple strategy is to discharge VRB whenever extra power (in addition to wind power and µCHP

power) is needed. It is not optimal because it does not consider the varying electricity price. Another

policy is to discharge the VRB only in periods when the utility electricity price is high. However, this

keeps the VRB fully charged most of the time, so surplus wind power

cannot be stored. Building a mathematical model of the microgrid environment is complex

and impractical. Therefore, we propose a reinforcement learning-based VRB discharging strategy

by evaluating decisions’ immediate and subsequent effects on the ecosystem. The centralized agent

gets load demand and µCHPs and wind generation information, and takes into account their possible

stochastic changes during policy making. With reinforcement learning, the discharging policy can be

obtained from the interaction between the agent and the environment without establishing its detailed

models.

With the three-level hierarchical optimization, the energy consumption cost for the whole

community is minimized. To guarantee fairness for all households, their utility bills need to be

balanced according to their energy consumption and generation at different times. For example, a

house with low subscription rate may have high load demand which consumes supplies from others’

subscribed wind power or µCHP generation. It is unfair for the latter to pay more gas fees or provide

their own wind power for the former without bill balance.

2.3 Design of Distributed DR Systems

2.3.1 DR Model

DR gives load scheduling decisions, which are updated dynamically over time

according to changes in load demand and wind power prediction. Electric load demands are

either schedulable or fixed energy-consuming tasks. A schedulable task can be assigned to operate at

different times, each with a different level of user satisfaction. For example, running a laundry machine and EV

charging are schedulable tasks. It is also assumed that a scheduled task cannot be interrupted once it

starts. In contrast, a fixed task is time-sensitive and must be executed at a designated time, such

as operating a refrigerator, watching TV programs at a specific time, and turning on heating and

air conditioning by house residents. DR is designed for schedulable tasks and will give the optimal

scheduling solution at each decision time. A DR decision time point can be in three situations: at the

beginning of each hour, when a user adds new tasks, or when a user intends to adjust the scheduling

decisions.

Figure 2.4: Model of Dynamic DR. Diagram labels: dynamic task array (schedulable: new requested tasks, pending tasks, interfered tasks; unschedulable: started tasks, fixed tasks, dropped tasks); external information (utility power price, wind power prediction); task preference rate functions; distributed agent; scheduling decision; input and update arrows.

The DR model diagram is shown in Fig. 2.4. It consists of dynamic task array, external

information, and task preference rate functions as input. The output is DR scheduling. The scheduling

results will be updated according to new available information at each decision time. Dynamic task

array consists of six types of tasks as listed below. Only the first three types are schedulable in DR.

• New requested tasks: new tasks requested by the user.

• Pending tasks: scheduled tasks that have not yet started.

• Interfered tasks: tasks whose scheduling time is adjusted by the user.

• Started tasks: scheduled tasks that have already started.

• Fixed tasks: tasks strictly required to be executed at certain time.

• Dropped tasks: tasks dropped by the agent considering the maximum power constraint.

The external information includes the day-ahead time-varying utility electricity price and

hourly updated wind power forecast. Each task is associated with a preference function, which is

designed to indicate a user’s varying satisfaction dependent on the task’s starting time. Preference

functions are updated dynamically according to users’ preference change due to summer/winter

time switch, weather change, holiday seasons, short term change of living habit caused by irregular

working agenda, etc.

At each DR decision time $k$, the preference function $F^{k}_{pr,i}(t)$ for task $i$ is a function of the task

starting time $t$. The preference function is based on $f^{k}_{i}(t)$, which is the estimated probability density

function (pdf) of task $i$'s starting time $t$. As a non-parametric density estimation method, the kernel

density estimation (KDE) method has broad applications in the univariate case [26] and is suitable

for estimating $f^{k}_{i}(t)$. Initially, $f^{0}_{i}(t)$ is estimated from the historical task execution record as:

$$f^{0}_{i}(t) = \frac{1}{Nh} \sum_{n=1}^{N} K\left(\frac{t - T_n}{h}\right) \tag{2.1}$$

where $K$ is a symmetric probability density function, e.g., the Gaussian density function, called the kernel

function. $T_n$ is the $n$th sample in the data set, and $N$ is the total number of samples in the data set. $h$

is the smoothing parameter called the bandwidth, which determines the trade-off between estimation

bias and variance. Since the performances of different kernel functions are very similar, the Gaussian

kernel is selected for its convenient mathematical properties. $h$ is selected to minimize the mean

integrated square error (MISE) between the estimate $\hat{f}$ and the true density $f$, defined as:

$$\mathrm{MISE}(\hat{f}) = E \int \left[\hat{f}(t) - f(t)\right]^{2} dt \tag{2.2}$$

For the Gaussian kernel, the optimal bandwidth is $h^{*} = 1.06\,\sigma N^{-1/5}$, where $\sigma$ is the sample standard

deviation. $f^{k}_{i}(t)$ at time point $k$ is updated by processing new samples, i.e., tasks' actual starting

times, with either an ordinary update or a weighted update. The main idea is to give the user's adjustments more weight

so that changes in the user's preference are learned faster. If a task is scheduled by DR and accepted by the user, the

sample is processed with an ordinary update. If the scheduling is not accepted and the task is rescheduled by

the user, the sample is processed with a weight $M$. The exact value of $M$ is determined by a tunable parameter

$\rho$ ($\rho > 0$) as $M = \max(2, \lfloor \rho N \rfloor)$, where $N$ is the size of the data set. The data set has a fixed capacity.

When the data set is full and new samples arrive, the oldest ones are replaced. $F^{k}_{pr,i}(t)$ is set

equal to the normalized pdf, i.e., $f^{k}_{i}(t) / \max_{t}\{f^{k}_{i}(t)\}$.
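The estimation and update steps above can be illustrated with a minimal sketch. It assumes a Gaussian kernel with the rule-of-thumb bandwidth, and it interprets the weighted update as inserting the rescheduled sample $M$ times, which is only one possible reading of the text. The function names, the sample history, the value of `rho`, and the data set capacity are all hypothetical.

```python
import math

def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth h* = 1.06 * sigma * N^(-1/5) for a Gaussian kernel."""
    n = len(samples)
    mean = sum(samples) / n
    # Sample standard deviation (n - 1 in the denominator).
    sigma = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * sigma * n ** (-0.2)

def gaussian_kernel(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def kde(t, samples, h):
    """Kernel density estimate of Eq. (2.1) evaluated at start time t."""
    n = len(samples)
    return sum(gaussian_kernel((t - s) / h) for s in samples) / (n * h)

def preference(samples, grid):
    """Normalized preference f(t) / max f(t), evaluated on a time grid."""
    h = silverman_bandwidth(samples)
    density = [kde(t, samples, h) for t in grid]
    peak = max(density)
    return [d / peak for d in density]

def add_sample(samples, start_time, rescheduled, rho=0.05, capacity=200):
    """Add one observed start time; a user-rescheduled sample is repeated M times.

    Repeating the sample is an assumed interpretation of the weighted update.
    """
    weight = max(2, math.floor(rho * len(samples))) if rescheduled else 1
    samples = samples + [start_time] * weight
    return samples[-capacity:]  # drop the oldest samples when over capacity

history = [18.0, 19.0, 19.5, 20.0, 20.5, 21.0, 19.0, 20.0]  # hypothetical start hours
grid = [i / 2 for i in range(48)]                           # half-hour grid over one day
pref = preference(history, grid)
updated = add_sample(history, 22.0, rescheduled=True)
```

Normalizing by the peak keeps the preference value in $[0, 1]$, so a task started at its most likely time contributes its full weight to the satisfaction term.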

2.3.2 Optimization problem formation

The length of a DR cycle is set to 24 hours for scheduling tasks. Because users desire more

satisfaction at lower cost, the optimization at each decision time minimizes the unit cost, i.e., the

energy consumption cost per unit of satisfaction, for the current and remaining time in the cycle. Monetary

cost is calculated as the product of electricity price (utility price and wind power price), load power

demand, and load duration. After discretization with time resolution $\tau$, the energy consumption cost

at DR decision time $k_n$ for all tasks ($i = 1, \ldots, I$) is formed as:

$$C(k_n) = \sum_{k=k_n}^{N_D} \left[ R_W(k) E_{LW}(k) + R_G(k) E_{LG}(k) \right] \tag{2.3}$$



where $E_{LW}(k)$ and $E_{LG}(k)$ are the wind and utility grid energy used in time slot $k$, respectively. Considering

both the schedulable tasks and fixed tasks, $E_{LW}(k)$ is calculated according to the user's wind

power subscription rate $\alpha_s$, and its price is $R_W(k)$. The extra energy demand $E_{LG}(k)$ is supplied

by the utility grid at price $R_G(k)$. The power consumption from wind, $P_{LW}(k)$, and from the grid, $P_{LG}(k)$, at time

$k$ are formulated as:

$$\begin{aligned}
P_{LW}(k) &= \min\Big\{ \sum_{i \in I_{S,k_n}} P_i(k) + P_F(k),\ \alpha_s P_W(k) \Big\} \\
P_{LG}(k) &= \max\Big\{ 0,\ \sum_{i \in I_{S,k_n}} P_i(k) + P_F(k) - \alpha_s P_W(k) \Big\}
\end{aligned} \tag{2.4}$$

where $P_i(k)$ is the power consumption of task $i$ at time $k$. Each task $i$ requires $T_{R,i}$ time slots for

operation. $P_i(k)$ equals the rated power $P_{R,i}$ if time $k$ is within the task operating time; otherwise,

it is 0. The total satisfaction a user can get at decision time $k_n$ by scheduling tasks is:

$$U(k_n) = \sum_{i \in I_{k_n}} U_i(k_n) = \sum_{i \in I_{k_n}} u_i\, s_i(k_n)\, F^{k_n}_{pr,i}\big(k_{sh,i}(k_n)\big) \tag{2.5}$$

Other variables and parameters include:

a) Control variables at decision time $k_n$: $s_i(k_n)$ is a binary value indicating the scheduling decision

for task $i$ at $k_n$; 1 means the task is scheduled and 0 means it is not. It determines the set of scheduled

tasks $I_{S,k_n}$ after decisions at time $k_n$ are made. $k_{sh,i}(k_n)$ is the scheduled starting time of task $i$.

b) Power parameters: $P_W(k)$ and $P_F(k)$ are the predicted total wind power supply (kW) and the total load

of fixed tasks (kW), respectively, at time $k$.

c) Time parameters: $N_D$ is the number of time slots in one DR cycle.

d) Other parameters: $R_G(k)$ and $R_W(k)$ are the utility electricity price (USD/kWh) and wind power

price (USD/kWh), respectively, at time $k$. $u_i$ is the weight coefficient reflecting the importance of

task $i$. $I_{k_n}$ is the set of tasks that have not started by the beginning of the DR decision at $k_n$.
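As an illustration of (2.3) and (2.4), the following sketch evaluates the energy cost of one candidate schedule. It assumes the scheduled task powers have already been summed per slot and that energy is power times the slot length $\tau$ in hours; the function name and all numeric values are hypothetical.

```python
def dr_energy_cost(task_power, fixed_power, wind_power, alpha_s,
                   price_wind, price_grid, tau=1.0):
    """Energy cost C(k_n) over the remaining slots, following Eqs. (2.3)-(2.4).

    All arguments except alpha_s and tau are per-slot lists of equal length;
    task_power[k] is the summed power of scheduled tasks in slot k (kW).
    """
    cost = 0.0
    for k in range(len(task_power)):
        demand = task_power[k] + fixed_power[k]
        subscribed_wind = alpha_s * wind_power[k]
        p_lw = min(demand, subscribed_wind)        # power drawn from wind
        p_lg = max(0.0, demand - subscribed_wind)  # remainder from the grid
        # Energy = power * slot length tau (hours), an assumed discretization.
        cost += price_wind[k] * p_lw * tau + price_grid[k] * p_lg * tau
    return cost

# Two one-hour slots with a 10% subscription to 10 kW of wind power.
cost = dr_energy_cost(task_power=[1.0, 2.0], fixed_power=[0.5, 0.5],
                      wind_power=[10.0, 10.0], alpha_s=0.1,
                      price_wind=[0.05, 0.05], price_grid=[0.10, 0.20])
```

In this example the subscribed wind covers 1 kW in each slot, and the remaining 0.5 kW and 1.5 kW are bought from the grid at the slot's utility price.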

DR optimization constraints are as follows. First, all scheduled tasks must be completed before

the end of the DR cycle; no task may be postponed to the next day. Second, one task may

depend on the completion of another; for example, a dryer can start only after the laundry

machine finishes washing. Third, the execution time of each task must be scheduled between the current

decision time and the end of the DR cycle. Finally, each house has a maximum allowed power restricted

by the circuit breaker.



Tasks are allowed to be dropped when some constraints cannot be satisfied, for example

when a user has tasks with high rated power that cause the total power to exceed the

maximum allowed value at some time. A task dropping penalty $P_T$ is thus introduced. The optimization

minimizes the unit cost with the task drop penalty considered:

$$\min\ \frac{C(k_n)}{U(k_n)} + \Big[ |I_{k_n}| - \sum_{i \in I_{k_n}} s_i(k_n) \Big] P_T \quad \text{s.t. DR constraints} \tag{2.6}$$
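A small sketch of how the objective in (2.6) could be evaluated for one candidate decision is given below. The tuple layout, the function name, and the handling of the case with zero satisfaction are assumptions made for illustration only.

```python
def unit_cost_objective(cost, tasks, drop_penalty):
    """Objective of Eq. (2.6): cost per satisfaction plus a penalty per dropped task.

    tasks is a list of (u_i, s_i, F_i) tuples: importance weight, binary
    scheduling decision, and preference value at the chosen start time.
    """
    satisfaction = sum(u * s * f for u, s, f in tasks)   # Eq. (2.5)
    dropped = len(tasks) - sum(s for _, s, _ in tasks)   # |I| - sum of s_i
    if satisfaction <= 0.0:
        # Assumption: with nothing scheduled the unit cost is undefined,
        # so the candidate is treated as infinitely bad.
        return float("inf")
    return cost / satisfaction + dropped * drop_penalty

# Hypothetical candidate: two tasks scheduled, one dropped.
candidate = [(1.0, 1, 0.8), (2.0, 1, 0.6), (1.0, 0, 0.0)]
objective = unit_cost_objective(cost=0.45, tasks=candidate, drop_penalty=10.0)
```

A large penalty relative to typical unit costs makes dropping a task a last resort, which matches its role as a fallback when the power constraint cannot be met.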

2.4 Shared Cost-led µCHPs Management

Operations of distributed µCHPs are optimized according to the load scheduling results

from the DR system. The µCHP model described in [21, 17] is applied in this dissertation.

2.4.1 µCHP Model

A µCHP unit has three statuses: idle, start-up, and generation. When the system is idle,

there is no fuel consumption or power generation. In the start-up period, fuel is consumed

without power generation. After start-up, the system consumes fuel and generates both electric

and thermal power. The efficiency of the µCHP unit, $\eta$, denotes the percentage of total useful power

generated from the fuel input. Specifically, the electric efficiency $\eta_E$ and thermal efficiency $\eta_T$ indicate the

proportions of generated electric and thermal power. The total power generation of the µCHP unit at

time $t$ is [21, 17]:

$$P_C(t) = \eta\, S_C(t)\, g_F(t)\, q_F \tag{2.7}$$

where:

$S_C(t)$: binary status of the µCHP, 0 for the idle and start-up statuses and 1 for the generation status.

$g_F(t)$: fuel stream input (Nft³/s).

$q_F$: heating value of fuel (kJ/Nft³).

Thus, the generated electric power is $P_{CE}(t) = \eta_E P_C(t)$ and the

thermal power is $P_{CT}(t) = \eta_T P_C(t)$.
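As a quick numerical check of (2.7), the sketch below computes the total, electric, and thermal output of one unit. With $g_F$ in Nft³/s and $q_F$ in kJ/Nft³ the product is in kJ/s, i.e., kW. The function name and the efficiency and fuel values are hypothetical and not taken from the cited model.

```python
def chp_output(status, fuel_flow, heating_value, eta, eta_e, eta_t):
    """Return (P_C, P_CE, P_CT) in kW for one micro-CHP unit, following Eq. (2.7).

    status is 0 (idle or start-up) or 1 (generation); fuel_flow is in Nft^3/s
    and heating_value in kJ/Nft^3.
    """
    p_total = eta * status * fuel_flow * heating_value
    return p_total, eta_e * p_total, eta_t * p_total

# Hypothetical unit: 90% overall efficiency, 30%/70% electric/thermal split.
p_c, p_ce, p_ct = chp_output(status=1, fuel_flow=0.01, heating_value=1000.0,
                             eta=0.9, eta_e=0.3, eta_t=0.7)
```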

2.4.2 Shared Cost-led µCHPs Management Strategy

In view of the dynamic characteristics of the microgrid discussed above, including DR and

wind power forecast, it is important to ensure fast response of µCHPs to the load and supply changes.



Otherwise, more power will be consumed from utility grid and extra cost will be induced. When

a µCHP unit is on, the fuel cell power output can be controlled to respond to the input change

within 30 seconds [27]. Frequent status change not only wastes lots of fuel on start-up but also

slows down the response speed. Thus, one possible strategy is to keep all µCHP units in generation

status and adjust their power output according to load demand by controlling the fuel stream input

$g_F(t)$. When a µCHP is on, however, it produces a minimum amount of heat, which may exceed the requirement,

and only a limited amount of extra heat can be stored in the water tank.

First, the water tank has an acceptable temperature range around a most desired value. Second, heat dumps

have negative effects on the environment and are usually restricted. In addition, this strategy is not

always cost effective. Therefore, it is important to determine the optimal “on/off” state of µCHPs

in advance according to available information. Since the amount of heat dump is limited, thermal

power generation should be constrained to keep desired water tank temperature.

The shared cost-led µCHPs management is formed as a two-level optimization problem.

The main idea is that the coarse-grained optimization has long-term perspectives and will guide the

fine-grained optimization in terms of µCHP state and thermal power generation along the time. The

fine-grained and coarse-grained optimization have time resolutions τ (slot, same as DR resolution)

and TC (period), respectively. TC is an integral multiple of τ . The CHP start-up time TS is also set

to an integral multiple of TC. The fine-grained optimization is to determine the detailed optimal

µCHP fuel input stream and electric heat pump generation for each slot τ in the current period TC

to minimize the energy consumption cost of the whole community. The coarse-grained one is to

minimize the sum of the approximate cost of the community in next NP coarse-grained periods

by determining the optimal µCHP states, the average fuel input volume, and the average electric

heat pump generation in each period. The solved optimal µCHP states are used for the µCHPs'

state transitions. The total thermal generation in each period serves as a constraint for the

fine-grained optimization in that period. For instance, if the status of one CHP unit is “off” in the current

coarse-grained period but it is preferred to generate power in the next period (assuming the start-up takes

one period), the system will prepare for start-up in the current period. This dissertation sets NP = 6 and

NS = 1. With this strategy, the predicted information is utilized, system responsiveness is guaranteed,

and an optimal solution for cost reduction is obtained.



2.4.2.1 Fine-grained Optimization for Current Period

There are $N_C = T_C/\tau$ time slots in one period. The total load demand in the community

$P_{SL}(n)$ in time slot $n$ is calculated according to DR decisions as $P_{SL}(n) = P_F(n) + \sum_{i \in I_{S,n}} P_i(n)$.

The electric power consumption cost $C_{CE,k}(n)$ and fuel consumption cost $C_{CF,k}(n)$ of the whole

community in time slot $n$ of the current period $k$ are formulated as:

$$C_{CE,k}(n) = R_G(n) E_{EG}(n) + R_W E_{EW}(n) \tag{2.8}$$

$$C_{CF,k}(n) = R_F\, \tau \sum_{m \in M_G} g_{F,m,k}(n) \tag{2.9}$$

where $\tau \sum_{m \in M_G} g_{F,m,k}(n)$ in (2.9) is the total µCHP fuel input volume within time $\tau$. The fuel consumption

cost $C_{CF,k}$ is obtained as the product of fuel volume and fuel price. Variables and parameters in (2.8)

and (2.9) include:

a) Control variables: $g_{F,m,k}(n)$ is the fuel input stream of µCHP $m$ at time $n$ (Nft³/s).

b) Energy and power terms: $E_{EW}(n)$ and $E_{EG}(n)$ are the calculated energy consumption (kWh) from

the wind power supply and the utility grid, respectively, at time $n$ according to the scheduled load

demand $P_{SL}(n)$ and the µCHP electric power generation $P_{CHPE}(n)$ in the microgrid, where

$P_{CHPE}(n) = \eta_E q_F \sum_{m \in M_G} g_{F,m,k}(n)$, $\eta_E$ is the electric efficiency of µCHPs, and $q_F$ is the heating value of fuel

(kJ/Nft³).

c) Other parameters: $R_F$ is the fuel gas price (USD/Nft³). $M_G$ denotes the set of houses whose µCHP

state is “generation”.

Extra generated power is sold to the utility grid with income $B_{CM,k}$. The optimization problem at the

current decision time $k$ is to minimize the total cost in the period as:

$$\min \sum_{n=1}^{N_C} \left[ C_{CE,k}(n) + C_{CF,k}(n) - B_{CM,k}(n) \right] \tag{2.10}$$

subject to the following constraints: First, the thermal generation of each house should be equal to

the value solved from coarse-grained optimization. Second, both fuel input and electric heat pump

generation have their allowable ranges. Finally, with electric heat pump added, the total electric

power consumption in a house cannot exceed the maximum value.



2.4.2.2 Coarse-grained optimization for Future Periods

The coarse-grained optimization has to consider NP periods. The cost function is the sum

of approximated cost of the whole community in these NP periods. Control variables are each CHP’s

state (binary values, 0 for idle state and 1 for generation state), its average fuel input volume (for the

units at generation state), and electric heat pump power consumption for heat generation. For each

time period, its approximated cost has the same formation as the fine-grained one, except that it has a

larger time resolution TC.

In addition to the constraints on thermal generation and maximum load, other constraints

include: first, in each period, the temperature of the water tanks should be maintained within a range;

second, at a designated time, the temperature of the water tank should reach the set point, i.e., the desired

average temperature; third, the maximum allowable heat dump is constrained.

2.5 VRB Discharging Management with Q-Learning

In this third-level optimization, VRB discharging is optimized to supply the

remaining load after consuming wind and µCHP power at the first two levels. This happens when

the load demand is high but the wind power is low or the µCHP generates insufficient power for

stochastic load demand. The efficiency of VRB is determined by its charging/discharging current

and the state of charge (SOC) with nonlinear characteristics. To keep high battery efficiency, the

charging/discharging current and SOC are constrained within certain ranges, in which the efficiency

can be approximated to be a constant value [25, 28].

The stochastic load demand and wind power can be modeled as Markov chains. VRB

management is formed as a Markov decision process (MDP) with the decision time resolution τ . At

decision time $k$, the system state can be described as:

$$X(k) = \left[ R_G(k),\ P_W(k),\ E_{INS}(k),\ S_{DOD}(k) \right] \tag{2.11}$$

where $E_{INS}(k)$ is the state of remaining load demand energy calculated by applying wind power

$P_W(k)$ and µCHP generation $P_{CHPE}(k)$ to load demand $P_{SL}(k)$: $E_{INS}(k) = \max\{0,\ \tau [P_{SL}(k) - P_W(k) - P_{CHPE}(k)]\}$.

$S_{DOD}(k)$ is the depth of discharge (DOD) state of the VRB. The action is to

discharge $\mu(k)$ percent of $E_{INS}(k)$ from the VRB at decision time $k$. Actions are constrained by

the minimum and maximum VRB discharging power, as well as the VRB's SOC. The reward function for

action $\mu(k)$ is designed considering both the cost saving from discharging and system stability with

battery backup energy:

$$r(k) = a_1(k) - \lambda a_2(k) \tag{2.12}$$

where

$$a_1(k) = \frac{[R_G(k) - R_B]\, \mu(k)\, E_{INS}(k)}{(R_{G,max} - R_B)\, E_{D,max}} \tag{2.13}$$

$$a_2(k) = \frac{\Delta S_{DOD}(k)}{1 - S_{DOD}(k+1)} \tag{2.14}$$

a1(k) is the normalized cost reduction for the microgrid with VRB discharging. RB is the VRB

discharging cost (USD/kWh). RG,max is the maximum value of time-varying utility price. ED,max is

the maximum VRB discharging energy in a decision period. a2(k) is the normalized battery DOD

change weighted by battery SOC state at time k + 1. λ is a positive weight for a2. When a1(k) is

larger, which indicates more cost saving is achieved, the action is considered as cost effective and

denoted with larger reward. On the other hand, larger a2(k) means more energy is discharged from

VRB and therefore less energy is available as backup. In that case, the reward is reduced.
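The reward in (2.12) to (2.14) can be illustrated with a short sketch. It assumes that $\mu(k)$ is expressed as a fraction in $[0, 1]$ and that $\Delta S_{DOD}(k) = S_{DOD}(k+1) - S_{DOD}(k)$; neither convention is stated explicitly in the text, and all names and numbers are hypothetical.

```python
def remaining_energy(p_load, p_wind, p_chp, tau):
    """E_INS(k): load energy left after applying wind and micro-CHP power."""
    return max(0.0, tau * (p_load - p_wind - p_chp))

def vrb_reward(price, mu, e_ins, dod_now, dod_next,
               r_b, price_max, e_d_max, lam):
    """Reward r(k) = a1(k) - lambda * a2(k), following Eqs. (2.12)-(2.14)."""
    # a1: normalized cost reduction from discharging a fraction mu of E_INS.
    a1 = (price - r_b) * mu * e_ins / ((price_max - r_b) * e_d_max)
    # a2: DOD increase, weighted by the remaining state of charge at k + 1.
    a2 = (dod_next - dod_now) / (1.0 - dod_next)
    return a1 - lam * a2

e_ins = remaining_energy(p_load=5.0, p_wind=2.0, p_chp=1.0, tau=1.0)
reward = vrb_reward(price=0.20, mu=0.5, e_ins=4.0, dod_now=0.2, dod_next=0.3,
                    r_b=0.05, price_max=0.25, e_d_max=5.0, lam=0.5)
```

Because $a_2$ grows quickly as $S_{DOD}(k+1)$ approaches 1, deep discharges are penalized more strongly than shallow ones for the same change in DOD.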

The MDP will find the optimal policy $h^{*}$ and action $u_k = h^{*}(X(k))$ to maximize the total

reward that discounts the future rewards with a factor $\gamma$:

$$R = \sum_{n=0}^{\infty} \gamma^{n} r(k+n) \tag{2.15}$$

2.6 Problem-solving Algorithms

The DR is formulated as a nonlinear integer programming problem. For the shared cost-led

µCHP management, the fine-grained optimization for the current period is a linear programming

problem and the coarse-grained optimization for the future periods is a mixed integer nonlinear

programming problem. These nonlinear programming problems are non-convex, and finding their globally optimal

solutions is NP-hard. Therefore, in DR and µCHP management, locally optimal solutions are found

with the Particle Swarm Optimization (PSO) algorithm in real time for practical operation [29]. Works

[30, 31, 32, 33] have evaluated PSO on different benchmarks and shown good solution quality.

PSO has several advantages over other evolutionary algorithms, such as the Genetic Algorithm (GA) [34, 35].

First, PSO has more effective memory capacity and better diversity for optimal solution search.

Second, PSO has faster search speed which is important for highly dynamic systems, such as the DR

and shared cost-led µCHP management.

In PSO, a swarm $S$ is a set of particles $S = \{x_1, x_2, \ldots, x_N\}$, where $N$ is the number of particles

participating in the solution search. Each particle is a vector $x_i = \{x_{i,1}, x_{i,2}, \ldots, x_{i,M}\}^T$ indicating

its position in an $M$-dimensional space as a candidate solution to minimize the cost function $F(x_i)$. The dimension

of each particle depends on the number of control variables. Each particle also has a velocity

$v_i = \{v_{i,1}, v_{i,2}, \ldots, v_{i,M}\}^T$, the shift of its position in each iteration. The swarm of particles

updates its velocities and positions, in each iteration $k$, towards the target solution (with minimum

cost) by utilizing both the individual best position $p_i(k) = \{p_{i,1}(k), p_{i,2}(k), \ldots, p_{i,M}(k)\}^T$ and the global

historical best position $p_g(k) = \arg\min_{p_i(k)} F(p_i(k))$. The update is realized according to the following

equations:

$$\begin{aligned}
v_{i,j}(k+1) &= \omega v_{i,j}(k) + c_1 r_1 \left[p_{i,j}(k) - x_{i,j}(k)\right] + c_2 r_2 \left[p_{g,j}(k) - x_{i,j}(k)\right] \\
x_{i,j}(k+1) &= x_{i,j}(k) + v_{i,j}(k+1)
\end{aligned} \tag{2.16}$$

where $r_1$ and $r_2$ are random variables with uniform distribution in $[0, 1]$, $c_1$ and $c_2$ are acceleration

constants, and $\omega$ is the inertia weight, a value decreasing with time. To prevent swarm divergence, the

velocity of the $j$th component $v_{i,j}(k+1)$ is clamped as $|v_{i,j}(k+1)| \le V_{max,j} = (b_j - a_j)/2$, a common

selection, where $[a_j, b_j]$ is the feasible region of $x_{i,j}$. For the constraints of optimization, the method

of using a penalty function and preserving feasibility of the solution during initialization is adopted [32, 36].

For integer variables with a discrete search space, Discrete Particle Swarm Optimization (DPSO)

with rounding techniques proposed in [33] is used. For binary variables, a binary version of DPSO

with a sigmoid function is used [37].
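The update rule (2.16) with velocity clamping can be sketched for continuous variables as follows. This is a minimal illustration rather than the dissertation's Java implementation: the acceleration constants, the linear inertia schedule, the position clipping to the box bounds, and the sphere test function are all assumptions, and the penalty and discrete variants are omitted.

```python
import random

def pso(cost, bounds, n_particles=30, n_iter=200,
        c1=1.5, c2=1.5, w_start=0.9, w_end=0.4, seed=0):
    """Minimize cost(x) over box bounds [(a_j, b_j), ...] with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    v_max = [(b - a) / 2.0 for a, b in bounds]  # V_max,j = (b_j - a_j) / 2
    x = [[rng.uniform(a, b) for a, b in bounds] for _ in range(n_particles)]
    v = [[rng.uniform(-vm, vm) for vm in v_max] for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]
    p_cost = [float("inf")] * n_particles
    g_best, g_cost = x[0][:], float("inf")

    for it in range(n_iter):
        # Inertia weight decreasing linearly with the iteration count.
        w = w_start - (w_start - w_end) * it / max(1, n_iter - 1)
        for i in range(n_particles):  # evaluate and update individual bests
            c = cost(x[i])
            if c < p_cost[i]:
                p_best[i], p_cost[i] = x[i][:], c
        best = min(range(n_particles), key=lambda i: p_cost[i])
        if p_cost[best] < g_cost:     # update the global best
            g_best, g_cost = p_best[best][:], p_cost[best]
        for i in range(n_particles):  # velocity and position update, Eq. (2.16)
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                v_new = (w * v[i][j]
                         + c1 * r1 * (p_best[i][j] - x[i][j])
                         + c2 * r2 * (g_best[j] - x[i][j]))
                v[i][j] = max(-v_max[j], min(v_max[j], v_new))  # clamping
                a, b = bounds[j]
                # Assumption: positions are clipped to stay inside the bounds.
                x[i][j] = max(a, min(b, x[i][j] + v[i][j]))
    return g_best, g_cost

def sphere(x):
    """Hypothetical test cost with its minimum of 0 at the origin."""
    return sum(xj * xj for xj in x)

best_x, best_cost = pso(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

For the DR and µCHP problems, the `cost` argument would be the objective plus a constraint penalty, and the continuous positions would be rounded or passed through a sigmoid for the integer and binary variables.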

As a model-free reinforcement learning technique, Q-learning [38] is used to obtain the optimal

VRB discharging policy through interactions with the environment. At decision time $k$, Q-learning

observes the system state $X(k)$, takes an action $\mu(k)$, evaluates the reward $r(k)$, and updates the Q value

with a learning rate $\alpha \in [0, 1]$ and discount factor $\gamma$. The Q-learning algorithm with ε-greedy exploration

is shown in Algorithm 2.
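A minimal tabular sketch of the Q-learning update with ε-greedy action selection is given below. It uses a hypothetical four-state chain instead of the VRB state space, and it adds an episodic terminal state, fixed values of α, γ, and ε, and first-index tie-breaking, none of which come from the text.

```python
import random

def q_learning(n_states, n_actions, step, episodes=500, max_steps=50,
               alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; step(x, u) must return (x_next, reward, done)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        x = 0  # every episode starts in state 0 (an assumption of this toy)
        for _ in range(max_steps):
            if rng.random() < epsilon:
                u = rng.randrange(n_actions)  # explore
            else:
                u = max(range(n_actions), key=lambda a: q[x][a])  # exploit
            x_next, r, done = step(x, u)
            # Temporal-difference target; no bootstrapping from a terminal state.
            target = r + (0.0 if done else gamma * max(q[x_next]))
            q[x][u] += alpha * (target - q[x][u])
            if done:
                break
            x = x_next
    return q

def toy_step(x, u):
    """Hypothetical chain 0-1-2-3: action 1 moves right, action 0 stays."""
    if u == 0:
        return x, 0.0, False
    x_next = x + 1
    done = x_next == 3
    return x_next, (1.0 if done else 0.0), done

q = q_learning(n_states=4, n_actions=2, step=toy_step)
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(3)]
```

In the VRB setting, the state would be a discretized version of $X(k)$ in (2.11), the actions would be the admissible discharge levels $\mu(k)$, and the reward would follow (2.12).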

2.7 Bill Balancing Algorithm

To ensure fair energy usage for all households in the ecosystem, their bills need to be

balanced according to their energy consumption and generation along the time. Wind and battery

energy are allocated to a household with its subscription rate. In bill balancing, it is assumed a

household utilizes all of its subscribed wind and battery energy, supplying its load and selling the

extra energy to the utility grid. The bill balancing for µCHP energy generation and consumption is

more complex. At one time, a household performs as either a contributor (i ∈MC) or a beneficiary



Algorithm 1 PSO algorithm

Initialize each particle's position and velocity randomly;
For particles with infeasible positions, randomly adjust their components until all initial positions are feasible;
for each iteration do
    for each particle do
        Calculate fitness: fitness = cost function value + penalty;
        if a new individual best position is found then
            Update individual best position;
        end if
    end for
    if a new global best position is found then
        Update global best position;
    end if
    for each particle do
        Update velocity according to individual and global best positions;
        Apply velocity clamping;
        Update position;
    end for
end for
Continue iterating if the termination condition is not satisfied.

($i \in M_B$). A contributor exports a part of its µCHP electric energy to the microgrid or reaches a

balance between generation and demand without export. In contrast, a beneficiary consumes

electric energy from other µCHPs in the microgrid. Their relationship in the microgrid is shown in

Fig. 2.5. In each time slot, a household $i$ equipped with a µCHP consumes fuel $F_{CHP,i}$ and generates

electric energy $E_{CHPE,i}$ and thermal energy $E_{CHPH,i}$. $E_{CHPE,i}$ first supplies its own electric load $E'_{L,i}$,

which is the remaining load of household $i$ after utilizing its subscribed wind and battery energy.

$E_{L,CHPE,i}$ is the part of the electric energy supply from µCHPs. For a contributor $i$, $E_{CHPE,i} \ge E_{L,CHPE,i}$

and $E^{out}_{CHPE,MG,i}$ is exported to the microgrid as required by other households. The remaining generation is

sold to the utility grid as $E^{out}_{CHPE,UG,i}$. For a beneficiary $j$, $E^{in}_{CHPE,MG,j}$ is imported from other µCHPs

to supply its demand which is larger than $E_{L,CHPE,j} = E^{in}_{CHPE,MG,j} + E_{CHPE,j}$. The total µCHP

electric energy import matches the export inside the microgrid. With fairness consideration, $E^{in}_{CHPE,MG,j}$



Algorithm 2 Q-learning algorithm with ε-greedy exploration

Initialize Q values;
Measure the initial state;
for each step k do
    Select action u_k: a random action with probability ε_k; otherwise u_k ∈ argmax_{u′} Q_k(x_k, u′);
    Take action u_k, observe x_{k+1} and r_{k+1};
    Q_{k+1}(x_k, u_k) ← Q_k(x_k, u_k) + α_k [r_{k+1} + γ max_{u′} Q_k(x_{k+1}, u′) − Q_k(x_k, u_k)];
end for

drawn from the microgrid should be proportional to the household's load $E'_{L,j}$. The net benefit a

household obtains from µCHP generation in the microgrid is determined by three parts: cost saving

from self-generation, cost saving from importing energy from the microgrid, and fuel consumption

cost. It is fair for a household to get the full benefit from self-generation. Cost saving from energy

exporting/importing is achieved by both contributors and beneficiaries. Contributors also consume

more fuel to generate energy for beneficiaries. Therefore, the last two parts need to be balanced

among households.

Figure 2.5: µCHP electric energy flow among contributors and beneficiaries in the microgrid. Diagram labels: contributor $i$ with $F_{CHP,i}$, $E_{L,CHPE,i}$, $E^{out}_{CHPE,MG,i}$, and $E^{out}_{CHPE,UG,i}$; beneficiary $j$ with $F_{CHP,j}$, $E_{CHPE,j}$, $E_{L,CHPE,j}$, and $E^{in}_{CHPE,MG,j}$.


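In time slot $k$, the balanced bill of a household $i$ has the following formulation:

$$BB_i(k) = C'_{G,i}(k) + C_{W,i}(k) + \beta_i C_F(k) - \mu_i B_{CHP,share}(k) - B_{CHP,self,i}(k) \tag{2.17}$$

where:

$$\begin{aligned}
C'_{G,i}(k) &= R_G(k)\left[E_{L,i}(k) - \alpha_{s,i}\big(E_W(k) + E_B(k)\big)\right] \\
C_{W,i}(k) &= R_W\, \alpha_{s,i}\, E_W(k) \\
C_F(k) &= R_F \sum_{j \in M_{CHP}} F_{CHP,j}(k) \\
B_{CHP,share}(k) &= R_G(k) \sum_{j \in M_C} E^{out}_{CHPE,MG,j}(k) \\
B_{CHP,self,i}(k) &= R_G(k) \min\{E_{CHPE,i}, E_{L,CHPE,i}\} + R_A(k)\, E^{out}_{CHPE,UG,i}(k)
\end{aligned}$$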

$C'_{G,i}(k)$ is the energy consumption cost of household $i$ when only wind, battery, and utility grid energy

are considered as supply. $C'_{G,i}(k)$ can be negative, which means the subscribed wind and battery energy

is larger than its demand and the extra energy is sold to the utility grid. $C_{W,i}(k)$ is the charge for wind

turbine maintenance. $C_F(k)$ is the total fuel consumption cost for µCHP generation in the microgrid.

$B_{CHP,share}(k)$ is the total cost saving achieved by exporting/importing µCHP electric energy inside

the microgrid. $B_{CHP,self,i}(k)$ is the cost saving achieved by household $i$ from its own µCHP generation.

$M_{CHP}$ is the set of households with µCHPs. $E_{L,i}(k)$ is the electric energy demand of household $i$

in time slot $k$. $E_W(k)$ and $E_B(k)$ are the total wind energy generation and battery energy discharging,

respectively. $\alpha_{s,i}$ is the wind and battery energy subscription rate of household $i$. Different from

renewable energy, µCHP energy is sold at an avoided cost rate $R_A(k)$, which is lower than the

retail rate RG(k). To fairly balance CF(k) and BCHP,share(k) for each household, ratios βi and µi

should be well designed. It is fair for a household with larger EL,CHPE,i , EL,CHPH,i, and EoutCHPE,UG,i

to pay more for fuel consumption. BCHP,share(k) is achieved by µCHP energy sharing inside the

microgrid, which should be balanced according to EoutCHPE,MG,i and Ein

CHPE,MG,i of each household.

Thus, two metrics UCF,i and Ushare,i are designed to describe the fairness of balancing CF(k) and

23


BCHP,share(k), respectively, as

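\[
U_{\mathrm{CF},i} = \beta_i C_{\mathrm{F}}(k) / E_{\mathrm{CHPE,use},i} \tag{2.18}
\]

\[
U_{\mathrm{share},i} =
\begin{cases}
\mu_i B_{\mathrm{CHP,share}}(k) / E^{\mathrm{out}}_{\mathrm{CHPE,MG},i}, & i \in M_{\mathrm{C}} \\
\mu_i B_{\mathrm{CHP,share}}(k) / E^{\mathrm{in}}_{\mathrm{CHPE,MG},i}, & i \in M_{\mathrm{B}}
\end{cases}
\tag{2.19}
\]

where:

\[
E_{\mathrm{CHPE,use},i} = \left(E_{\mathrm{L,CHPE},i} + E^{\mathrm{out}}_{\mathrm{CHPE,UG},i}\right)/\eta_e + E_{\mathrm{L,CHPH},i}/\eta_h
\]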

$U_{\mathrm{CF},i}$ is the unit fuel cost per µCHP energy usage, i.e., supplying load and selling to the utility grid, in which the energies are weighted by the µCHP electric efficiency $\eta_e$ and thermal efficiency $\eta_h$. $E_{\mathrm{L,CHPH},i}$ is the part of $E_{\mathrm{CHPH},i}$ used for water tank heating. $U_{\mathrm{share},i}$ is the unit cost saving per µCHP electric energy export/import inside the microgrid. $\beta_i$ and $\mu_i$ are designed following two rules. First, for fairness, $U_{\mathrm{CF},i}$ should be equal for all households, and so should $U_{\mathrm{share},i}$. Second, $\sum_{i \in M} \beta_i = \sum_{i \in M} \mu_i = 1$, where $M$ is the set of all households in the ecosystem. Thus, $\beta_i$ and $\mu_i$ can be selected as:

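\[
\beta_i = E_{\mathrm{CHPE,use},i} \Big/ \sum_{j \in M} E_{\mathrm{CHPE,use},j} \tag{2.20}
\]

\[
\mu_i =
\begin{cases}
E^{\mathrm{out}}_{\mathrm{CHPE,MG},i} \Big/ \left(2 \sum_{j \in M_{\mathrm{C}}} E^{\mathrm{out}}_{\mathrm{CHPE,MG},j}\right), & i \in M_{\mathrm{C}} \\
E^{\mathrm{in}}_{\mathrm{CHPE,MG},i} \Big/ \left(2 \sum_{j \in M_{\mathrm{C}}} E^{\mathrm{out}}_{\mathrm{CHPE,MG},j}\right), & i \in M_{\mathrm{B}}
\end{cases}
\tag{2.21}
\]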

After βi and µi are determined, the balanced bill for each household can be calculated according to

(2.17).
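The ratio selection in (2.20) and (2.21) can be illustrated with a minimal Python sketch. The function names and the dictionary-based inputs are hypothetical and are not taken from the Java simulation platform. The sketch assumes that, in a given time slot, each household is either a contributor or a beneficiary and that the total energy exported inside the microgrid equals the total energy imported, so that the $\mu_i$ sum to 1.

```python
def fuel_cost_shares(e_use):
    """beta_i per (2.20): share of the total uCHP energy usage E_CHPE,use."""
    total = sum(e_use.values())
    return {i: e / total for i, e in e_use.items()}


def sharing_benefit_shares(e_out, e_in):
    """mu_i per (2.21).

    e_out: exports of contributors (set M_C) inside the microgrid.
    e_in:  imports of beneficiaries (set M_B) inside the microgrid.
    Both groups are normalized by twice the total exported energy.
    """
    denom = 2.0 * sum(e_out.values())
    mu = {i: e / denom for i, e in e_out.items()}
    mu.update({i: e / denom for i, e in e_in.items()})
    return mu


# Illustrative numbers only (kWh in one time slot).
beta = fuel_cost_shares({"h1": 6.0, "h2": 3.0, "h3": 1.0})
mu = sharing_benefit_shares({"h1": 2.0, "h2": 1.0}, {"h3": 3.0})
```

With these numbers, half of the sharing benefit goes to the two contributors and half to the beneficiary, and the saving per unit of exported or imported energy is the same for every household, which is the first design rule above.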

2.8 Simulation and Result

2.8.1 Simulation Configuration

The simulation platform is implemented with Java. The community is configured with 10

houses and 4 µCHPs. Suppose residents leave home at 8 AM, and each day starts at 8 AM and ends

at 8 AM of the next day. Each house is configured with its own fixed load, schedulable tasks (EV

charging, laundry machine, dryer, PC downloading, etc.) and preferred execution time periods. The

simulation for one week is first evaluated.

The wind velocity is generated according to a Rayleigh distribution with an average speed of 20 m/s. It is assumed that there is 0-30% variance between consecutive hourly updated wind forecasts. There is also 0-20% variance between the forecasted and actual wind power generation. The wind turbine has a 20 kW rated power output, 3.1 m/s cut-in speed, 13.8 m/s rated speed, and 54 m/s maximum speed.

The wind power generation is shown in Fig. 2.6.
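As a rough illustration of this configuration, the following Python sketch samples Rayleigh-distributed wind speeds with a 20 m/s mean and maps them to turbine output. The text does not specify the power curve, so the cubic ramp between the cut-in and rated speeds is an assumption made only for this example, as are all function and constant names.

```python
import math
import random

CUT_IN = 3.1        # m/s, from the simulation configuration
RATED_SPEED = 13.8  # m/s
MAX_SPEED = 54.0    # m/s
RATED_POWER = 20.0  # kW


def sample_wind_speed(mean_speed=20.0, rng=random):
    """Draw a Rayleigh-distributed speed with the given mean.

    A Rayleigh distribution is a Weibull distribution with shape 2;
    its mean equals scale * sqrt(pi) / 2, which fixes the scale.
    """
    scale = 2.0 * mean_speed / math.sqrt(math.pi)
    return rng.weibullvariate(scale, 2.0)


def turbine_power(v):
    """Output power in kW; the cubic ramp is an assumed power curve."""
    if v < CUT_IN or v > MAX_SPEED:
        return 0.0
    if v >= RATED_SPEED:
        return RATED_POWER
    return RATED_POWER * (v**3 - CUT_IN**3) / (RATED_SPEED**3 - CUT_IN**3)


rng = random.Random(0)
speeds = [sample_wind_speed(rng=rng) for _ in range(20000)]
powers = [turbine_power(v) for v in speeds]
```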

Figure 2.6: Wind turbine power generation (axes: Day; Wind Power (kW))

µCHPs are modeled with electric efficiency 0.27, thermal efficiency 0.63, gFmin = 0.0013 Nft³/s, and gFmax = 0.009 Nft³/s. The hot water tank has a temperature set point of 65 °C with an allowable range of ±3 °C. The VRB is first set with capacity Ecap = 10 kWh for overall system evaluation. Its discharging power is constrained by PD,min = 0.5 kW and PD,max = 4 kW. Its charging/discharging round-trip efficiency is set to 0.8. The time-varying utility price for simulation is generated based on the critical peak pricing (CPP) model [39] and shown in Fig. 2.7. In the hierarchical optimization, the time resolution is set to τ = 6 minutes. For µCHP management, the parameters are selected as TC = TS = 30 minutes and NP = 6. In PSO, the number of particles in a swarm is selected as 100 and 500 for DR and µCHP generation optimization, respectively. The maximum number of iterations is selected as 1000. w is selected with the initial value 0.9 and c1 = c2 = 1. Q-learning parameters are selected as α = 0.1, γ = 0.9, and u ∈ [0, 100%].
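To show where the learning rate α = 0.1 and discount factor γ = 0.9 enter, here is a minimal sketch of the standard tabular Q-learning update. The state and action labels are hypothetical placeholders and do not reproduce the VRB state, action, or reward design of Section 2.5.

```python
def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    Unseen state-action pairs default to 0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


# Hypothetical labels, for illustration only.
actions = ["hold", "discharge"]
q_table = {}
first = q_update(q_table, "s0", "discharge", 1.0, "s1", actions)
```

A small α makes each update move the stored value only one tenth of the way toward the new estimate, while γ = 0.9 keeps future cost savings almost as important as immediate ones.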

2.8.2 Result Analysis

The proposed energy ecosystem is first compared with a conventional distribution system that is configured with DR and µCHPs but without the wind turbine and VRB. In the conventional system, µCHPs are not interconnected, and each of them generates power according to the heat demand of its own house. The capital cost of a 20 kW wind turbine is about 70,000 USD (20-year life span), with an approximate maintenance cost of 1.5% of the investment cost per year [40, 41]. The VRB discharging cost can be approximated as 0.1 USD/kWh [42]. Even including the cost of the wind turbine and VRB, results in Fig. 2.8 show that a large cost reduction can be achieved in the ecosystem. Performance of the hierarchical optimization is further evaluated. For ease of analysis, the wind turbine investment cost is not included in the following analysis.

Figure 2.7: Utility electricity price (axes: Day; Utility Price (USD/kWh))

Figure 2.8: Cost comparison between our energy ecosystem and the conventional system (axes: Day; Cost (USD); series: Conventional System, Ecosystem (Energy Consumption Cost), Ecosystem (Investment and Maintenance))

2.8.3 Distributed DR Results and Analysis

The update of the preference function Fpr is affected by the weight parameter ρ. When ρ is small, Fpr changes slowly. If ρ is set too large, Fpr is oversensitive even to a single adjustment and therefore forms bumps, which is inaccurate and causes DR to search for solutions only in some local regions. ρ = 0.1 is selected for the following simulations because it gives a good trade-off between learning speed and accuracy. The energy consumption costs with and without DR are compared. A randomly selected house is evaluated for the performance of DR. Results are shown in Fig. 2.9. The normalized satisfaction U is used to present the influence of DR on users' satisfaction. Without DR, tasks start at users' most preferred time with U = 1. U with DR for the house is shown in Fig. 2.10. Results show that with DR, the daily energy consumption cost of the house is reduced by up to 43% while a high satisfaction is still achieved.

Figure 2.9: Electricity consumption cost of one sample house in each day (axes: Day; Cost (USD); series: Without DR, With DR)

Figure 2.10: Satisfaction degree of one house in each day (axes: Day; Normalized User's Satisfaction)

2.8.3.1 Centralized Shared Cost-led µCHP Management Results and Analysis

The two-level shared cost-led µCHP management is compared with the heat-led management strategy. DR is applied in both strategies. To evaluate the performance, one day is selected randomly, and the electric and thermal load demand of the community on that day are shown in Fig. 2.11. The total µCHP power generation and electric heat pump thermal power generation are shown in Fig. 2.12. The hot water tank temperature of one house is regulated within the preset region, as shown in Fig. 2.13. At the end of every NP coarse-grained periods (every 3 hours), the desired temperature setpoint is achieved. The daily energy consumption cost of the whole community compared with the heat-led management strategy is shown in Fig. 2.14. Results show that the shared cost-led µCHP management can reduce the energy consumption cost of the whole community by up to 19%. The exact cost reduction depends on wind power generation, utility power price, and load variance of the community.

Figure 2.11: Electric and thermal load demand of the community in the evaluated day (axes: Time; Load Demand (kW); series: Electric Load Demand, Thermal Load Demand)

Figure 2.12: Total µCHPs and heat pumps generation in the community in the evaluated day (axes: Time; Power Generation (kW); series: CHP Power Generation, Heat Pump Generation)

Figure 2.13: Hot water tank temperature in the evaluated house (axes: Time; Temperature (°C))

Figure 2.14: Energy consumption cost of the community with µCHP system generation (axes: Day; Cost (USD); series: Heat-led, Shared Cost-led)

2.8.3.2 Centralized VRB Management Results and Analysis

The performance of VRB management based on Q-learning is compared with the strategy in which the VRB discharges whenever the wind and µCHP power are insufficient to supply the total load of the community. The total cost reduction of the community from VRB discharging in an extended two-week simulation is shown in Fig. 2.15. In direct discharging mode, the VRB discharges to supply the extra load whenever the utility price is higher than the VRB discharging cost. Cost reduction increases when a higher VRB capacity is applied. This trend becomes less significant when the capacity is large, e.g., λ = 0.1 with capacity larger than 50 kWh, since the VRB cannot always be fully charged due to the limit of wind power generation. Compared to direct discharging, the proposed Q-learning method can achieve a higher cost reduction. As λ increases, more weight is given to energy reservation than to load shaving.

Figure 2.15: Energy consumption cost of the community with VRB discharging (axes: VRB Capacity (kWh); Cost Reduction (%); series: λ = 0.1, λ = 0.4, λ = 0.8, Direct Discharging)

Chapter 3

Vehicle-to-Grid Reactive Power Compensation


V2G systems provide ancillary services to the grid, such as voltage/frequency regulation,

load shifting, and renewable energy supporting and balancing. Benefits and challenges of V2G

technologies are reviewed in [43]. Some challenges limit the implementation of V2G systems,

e.g., battery degradation, impacts on distribution equipment, and high investment cost. An optimal

scheduling of V2G energy and ancillary services is studied in [44]. The goal is to maximize profits

to the aggregator while providing peak load shaving and system flexibility to the utility and low EV

charging cost to customers. The problem is formulated as a linear programming problem and solved

in a centralized way. However, the scalability issue is not discussed. Work [45] studies the capacity

management of V2G system for voltage regulation with the model of queuing network considering

EVs’ dynamic connections determined by drivers’ habits. The estimated capacity is used to set up

contracts between an aggregator and a grid operator for optimal grid support and maximum profits.

V2G load frequency control is studied in [46] with PEV users' convenience (battery SOC for driving) taken into consideration. Results show that the control performance degrades when users' convenience is considered, while this difference becomes smaller as the number of PEVs increases.

The potentials and characteristics of PEV bidirectional chargers working for reactive power

compensation are studied in [8, 47]. Work [48] proposes a V2G reactive power compensation system

for Wind DG units connected with a PEV charging/parking lot. The problem is formed as a two-stage


Stackelberg game and an optimal pricing scheme for the compensation performance is derived. In

work [49], a combined frequency and voltage regulation system based on PEV real and reactive

power compensation is proposed with two joint optimization models implemented, i.e., a command

based model and a price based model. Results show the trade-off between real and reactive power

compensation and the advantage of regulation. However, these works utilize PEVs as fixed VAR

resources belonging to the specific charging/parking lots. Their roles as mobile and flexible VAR

resources for the smart grid are not studied. The compensation performance will be greatly improved

if PEVs are scheduled for charging/parking considering the load profiles. In this case, additional

complexities are introduced to the system design and analysis, such as multi-objective cost function

and scalable problem solving approach, which will be investigated in this work.

3.2 V2G System Description

The power distribution system is assumed to be connected with distributed on-street charging stations. Each PEV is equipped with an on-board bidirectional AC charger. The V2G system is overall a cyber-physical system. The cyber part includes the information platform for reservation, the decentralized PEV parking and charging scheduling algorithm, the real-time bus monitoring platform, and the PWM control system for the full-bridge inverter charger. The physical part involves PEVs and their on-board chargers as actuators. Our work focuses on the PEVs' reservation and their parking/charging scheduling algorithm in the cyber part.

3.2.1 System Overview

The infrastructure of the proposed V2G system can be described hierarchically as two interacting layers, an electrical layer and a geographical map layer, shown in Fig. 3.1. The geographical map layer shows the streets, the location of charging stations, and PEV owners' destinations reflecting PEV owners' parking convenience. The electrical layer consists of electrical facilities, including feeders, charging stations, etc. PEVs at charging stations are controlled for both battery charging and reactive power compensation to grid buses. It is assumed that there are multiple charging stations connected to the same bus along one road segment. These stations are grouped and modeled as one charging station with its capacity of parking/charging spaces. A PEV owner drives his or her car to a charging station from a starting point, parks/charges the car, and then walks to the destination. An example is shown in Fig. 3.1 at the top layer. A PEV owner has two acceptable charging stations within walking distance of the destination. Thus, two possible driving and walking routes are labeled a and b, of which route a is preferred by the owner since it has a shorter walking distance than b.

PEVs are scheduled for parking and charging according to the system infrastructure and PEV owners' day-ahead reservations. Scheduling is necessary to reduce competition for the limited number of charging stations. Some people may prefer to drive freely without making reservations. Their random access to charging stations cannot be guaranteed and can only be handled on a best-effort basis. This case is not considered in our work. To make a reservation, a PEV owner submits the charging request to the scheduling system before a deadline, indicating acceptable charging stations, energy charging requirement, preferred parking interval, and maximum acceptable walking distance. The scheduling system will satisfy users' charging requests, offer them convenient parking service, and reduce the cost of charging as much as possible. Parking convenience is represented by the parking interval and walking distance. Each user has a preferred parking interval, which can be adjusted within a range at the price of a convenience degradation. Users are also sensitive to the walking distance between stations and their destinations and prefer the nearest ones. Monetary cost consists of charging and parking costs. Charging and parking prices are time-varying and differ among stations. In addition to users' benefits, the scheduling also makes decisions for the optimal reactive power compensation to the grid.

Figure 3.1: Electrical and geographical map layers of the V2G reactive power compensation system (legend: charging station, starting point, destination, transformer, feeder, load; routes a and b)

Figure 3.2: Charger operation mode (axes: Pch, Qcp; regions: charging/discharging, inductive/capacitive; apparent power limit Smax)

3.2.2 Model of On-board Charger

The PEV charger model described in [8, 47] is used in our system. The charger can operate in one of eight modes shown in Fig. 3.2 according to the values of $P_{\mathrm{ch}}$ and $Q_{\mathrm{cp}}$, which are the real and reactive power exchange between the charger and the grid, respectively. We do not consider the modes with battery discharging, i.e., only $P_{\mathrm{ch}} \ge 0$ is considered. However, the problem formulations and designed algorithms can be easily applied to situations with battery discharging by adjusting cost functions and constraints. The polarity of $Q_{\mathrm{cp}}$ represents different modes for reactive power. When $Q_{\mathrm{cp}} > 0$, the charger operates in an inductive mode and consumes reactive power from the grid, while when $Q_{\mathrm{cp}} < 0$, it is in a capacitive mode and compensates reactive power to the grid. $S_{\max}$ is the maximum apparent power that can be sustained by the charger, determined by the grid voltage $V_s$ and the charger's maximum allowable current $I_{\max}$ as $S_{\max} = V_s I_{\max}$. $S_{\max}$ constrains $P_{\mathrm{ch}}$ and $Q_{\mathrm{cp}}$ through $P_{\mathrm{ch}}^2 + Q_{\mathrm{cp}}^2 \le S_{\max}^2$ in operation. Therefore, for a charging station, its capability of reactive power compensation $Q_{\mathrm{cp,max}}$ is determined not only by the number of connected chargers, but also by the real charging power.
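The dependence of the compensation capability on the real charging power follows directly from the apparent power limit. A minimal Python sketch is given below; the function names and the example ratings are illustrative assumptions, not values from the dissertation.

```python
import math


def max_reactive_power(s_max, p_ch):
    """Largest |Q_cp| a charger can exchange while charging at p_ch,
    from P_ch^2 + Q_cp^2 <= S_max^2 (charging only, so p_ch >= 0)."""
    if not 0.0 <= p_ch <= s_max:
        raise ValueError("p_ch must lie in [0, s_max]")
    return math.sqrt(s_max**2 - p_ch**2)


def station_capability(chargers):
    """Total Q_cp,max of a station given (s_max, p_ch) per connected charger."""
    return sum(max_reactive_power(s, p) for s, p in chargers)


# Two connected chargers rated 5 kVA: one charging at 3 kW, one idle.
capability = station_capability([(5.0, 3.0), (5.0, 0.0)])
```

The charger drawing 3 kW can still provide 4 kVAr, whereas the idle one can provide its full 5 kVAr, which is why charging power and reactive compensation have to be scheduled jointly.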



3.3 Multi-objective Optimization Formulation

Two objectives are considered in the aggregator for PEV parking and charging scheduling.

One is for PEV agent benefits, including low monetary cost and high parking convenience. The

other optimizes the reactive power compensation to the grid. A multi-objective optimization is then

formulated in the following subsections.

3.3.1 Optimizing PEV Agent Benefits

On the PEV side, the parking cost, charging cost, and parking convenience of all PEV

agents are considered in the optimization. The objective of PEV side optimization is to minimize the

sum of unit costs per convenience for all PEV agents in the target area. For one PEV agent i, the

parking cost Cpk,i and charging cost Cch,i are presented as functions of parking/charging prices and

scheduling variables:

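\[
C_{\mathrm{pk},i} = \tau \sum_{j \in M_i} \sum_{t=1}^{T} y_{i,j}(t)\, R_{\mathrm{pk},j}(t) \tag{3.1}
\]

\[
C_{\mathrm{ch},i} = \tau \sum_{j \in M_i} \sum_{t=1}^{T} P_{\mathrm{ch},i,j}(t)\, R_{\mathrm{ch},j}(t) \tag{3.2}
\]

where:

\[
y_{i,j}(t) = u_i(t)\, x_{i,j} \tag{3.3}
\]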

Mi is the set of parking stations within the maximum acceptable walking distance of PEV agent

i. T is the number of time slots considered in the optimization. τ is the duration of each time slot.

Rpk,j(t) and Rch,j(t) are the parking and charging price of station j at time t, respectively. Control

variables include:

• $x_{i,j}$: binary parking assignment variables. $x_{i,j} = 1$ indicates that PEV $i$ is assigned to station $j$ for parking.

• $u_i(t)$: binary parking status of PEV $i$. $u_i(t) = 1$ if and only if PEV $i$ parks at time $t$.

• $y_{i,j}(t)$: binary variables of PEV parking status at station $j$ and time $t$. $y_{i,j}(t) = 1$ if and only if PEV $i$ parks at station $j$ at time $t$.

• $t_{s,i}, t_{e,i}$: parking interval $[t_{s,i}, t_{e,i}]$ of PEV $i$. They are integer variables and can be scheduled within a range.

• $P_{\mathrm{ch},i,j}(t)$: charging power of PEV $i$ at station $j$ at time $t$. It is a continuous variable in the range $[0, P_{\max,i}]$, where $P_{\max,i}$ is the maximum charging power.

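The control variables are interdependent through the following constraints:

\[
P_{\mathrm{ch},i,j}(t) \le y_{i,j}(t)\, P_{\max,i} \quad \forall i, j, t \tag{3.4}
\]

\[
(t_{s,i} - t)\, u_i(t) \le 0 \quad \forall i, t \tag{3.5}
\]

\[
(t - t_{e,i})\, u_i(t) \le 0 \quad \forall i, t \tag{3.6}
\]

\[
\sum_{t=1}^{T} u_i(t) = t_{e,i} - t_{s,i} + 1 \quad \forall i \tag{3.7}
\]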

Constraint (3.4) ensures Pch,i,j(t) = 0 if PEV i is not scheduled for charging at time t at station j.

Constraints (3.5)-(3.7) guarantee that ui(t) = 1 if t ∈ [ts,i, te,i] and ui(t) = 0 otherwise.
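A small Python sketch can make the role of constraints (3.5)-(3.7) and the cost definitions (3.1)-(3.2) concrete. It handles a single PEV assigned to a single station (so $x_{i,j} = 1$ for that station only); the function names, price tables, and numbers are hypothetical.

```python
def parking_status(ts, te, horizon):
    """u_i(t) for t = 1..T: 1 inside [ts, te], 0 elsewhere."""
    return {t: 1 if ts <= t <= te else 0 for t in range(1, horizon + 1)}


def satisfies_interval_constraints(u, ts, te):
    """Check (3.5)-(3.7) for a given 0/1 status vector u."""
    c35 = all((ts - t) * u[t] <= 0 for t in u)   # no parking before ts
    c36 = all((t - te) * u[t] <= 0 for t in u)   # no parking after te
    c37 = sum(u.values()) == te - ts + 1         # parked in every slot of [ts, te]
    return c35 and c36 and c37


def parking_and_charging_cost(u, p_ch, r_pk, r_ch, tau):
    """(3.1) and (3.2) for one PEV at its assigned station.

    r_pk[t], r_ch[t]: parking and charging prices of that station at slot t.
    p_ch[t]: charging power at slot t; tau: slot duration.
    """
    c_pk = tau * sum(u[t] * r_pk[t] for t in u)
    c_ch = tau * sum(p_ch[t] * r_ch[t] for t in u)
    return c_pk, c_ch


u = parking_status(ts=2, te=4, horizon=6)
p_ch = {t: 3.0 * u[t] for t in u}            # respects (3.4): zero when not parked
r_pk = {t: 2.0 for t in u}
r_ch = {t: 0.5 for t in u}
costs = parking_and_charging_cost(u, p_ch, r_pk, r_ch, tau=0.5)
```

A status vector that parks outside the interval, for example in slots 1 to 3 while $[t_{s,i}, t_{e,i}] = [2, 4]$, satisfies (3.7) but violates (3.5), which shows why all three constraints are needed together.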

The parking convenience Si of PEV agent i comprises the walking distance convenience

Swk,i and the parking time convenience Spt,i. A larger Si indicates a service of higher quality is

provided to PEV agent i. Swk,i decreases with the increase of walking distance. Spt,i can be reduced

by adjusting the PEV agent parking interval from the preferred one. Since the walking distance and

parking time interval have different scales and units, Swk,i and Spt,i are normalized and included in

Si as:

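\[
\begin{aligned}
S_{\mathrm{wk},i} &= \frac{d_{\max,i} - \sum_{j \in M_i} x_{i,j}\, d_{i,j} + \varepsilon}{d_{\max,i} - d_{\min,i} + \varepsilon} \\
S_{\mathrm{pt},i} &= \frac{\sum_{t=t^*_{s,i}}^{t^*_{e,i}} u_i(t) - \sum_{t=1}^{t^*_{s,i}-1} u_i(t) - \sum_{t=t^*_{e,i}+1}^{T} u_i(t)}{t^*_{e,i} - t^*_{s,i} + 1} \\
S_i &= \alpha S_{\mathrm{wk},i} + (1 - \alpha) S_{\mathrm{pt},i}
\end{aligned}
\tag{3.8}
\]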

where $\alpha \in (0, 1)$ is a weight coefficient. $d_{i,j}$ is the walking distance from parking station $j$ to the destination of PEV agent $i$. $d_{\max,i}$ and $d_{\min,i}$ are the maximum and minimum $d_{i,j}$, respectively. $\varepsilon$ is a small positive value ensuring a nonzero denominator when $d_{\max,i} = d_{\min,i}$. $[t^*_{s,i}, t^*_{e,i}]$ is the preferred parking interval of PEV agent $i$. $S_{\mathrm{wk},i}$ is always a positive value. $S_{\mathrm{pt},i}$ is constrained to be nonnegative in the optimization.
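The convenience measure (3.8) can be evaluated with a few lines of Python. This is a sketch under the assumption that the scheduled interval is contiguous, as enforced by (3.5)-(3.7); the function names and the example distances are hypothetical.

```python
def walking_convenience(distances, assigned, eps=1e-6):
    """S_wk: 1 for the nearest acceptable station, close to 0 for the farthest.

    distances: {station: walking distance to the destination} over M_i.
    """
    d_max, d_min = max(distances.values()), min(distances.values())
    return (d_max - distances[assigned] + eps) / (d_max - d_min + eps)


def parking_time_convenience(ts, te, ts_pref, te_pref):
    """S_pt: slots parked inside the preferred interval minus slots parked
    outside it, normalized by the preferred interval length."""
    inside = max(0, min(te, te_pref) - max(ts, ts_pref) + 1)
    outside = (te - ts + 1) - inside
    return (inside - outside) / (te_pref - ts_pref + 1)


def convenience(distances, assigned, ts, te, ts_pref, te_pref, alpha=0.5):
    """S_i = alpha * S_wk + (1 - alpha) * S_pt."""
    s_wk = walking_convenience(distances, assigned)
    s_pt = parking_time_convenience(ts, te, ts_pref, te_pref)
    return alpha * s_wk + (1.0 - alpha) * s_pt


# Preferred interval [3, 6]; scheduled at the nearest station for [4, 6].
s_i = convenience({"A": 100.0, "B": 300.0}, "A", 4, 6, 3, 6)
```

Both shortening the stay inside the preferred interval and extending it beyond the interval lower $S_{\mathrm{pt},i}$: the schedules [4, 6] and [2, 6] each give 0.75 for a preferred interval of [3, 6].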

The monetary cost of a PEV agent per unit of convenience defines its cost function. The system tries to schedule as many PEVs as possible. However, due to the limited station capacity, some reservations have to be dropped to obtain feasible solutions. For each unscheduled PEV $i$, a constant drop penalty $PT_i$ is included in the cost function, which is defined as:

\[
PT_i = C_{\mathrm{all},i} / S_i + \varepsilon \tag{3.9}
\]

where $C_{\mathrm{all},i}$ and $S_i$ are the maximum possible monetary cost and the minimum satisfaction rate for PEV owner $i$, respectively, had it been scheduled. $\varepsilon$ is a positive constant.

The cost function for PEV agents is the sum of their cost per convenience plus the drop

penalty, which is to be minimized:

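\[
\min\; C_{\mathrm{PEV}} = \sum_{i \in N} \left[ \frac{C_{\mathrm{ch},i} + C_{\mathrm{pk},i}}{S_i} + \Big(1 - \sum_{j \in M_i} x_{i,j}\Big) PT_i \right] \tag{3.10}
\]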

where N is the set of all registered PEVs. Constraints for the scheduling include:

• Assignment and capacity: One PEV can be assigned to at most one charging station. At any

time, a station cannot be scheduled with PEVs more than its capacity.

• Service: Users’ battery charging requests should be satisfied if their cars are scheduled. Each

PEV agent also has a minimum acceptable convenience rate and maximum acceptable cost per

convenience.

• Charging and compensation: Chargers’ real and reactive power are constrained by their

maximum apparent power Smax. When a PEV is not scheduled for charging, its charging

power should be 0.

3.3.2 Optimizing Utility Grid Reactive Power Compensation

The objective for the utility grid is to minimize the total insufficiency of VAr reservoir.

For each charging station, the reactive power compensation requirement of its connected load at a

time is estimated from historical data. It can be either reactive power consumption (inductive and

positive) or injection (capacitive and negative). When the magnitude of reactive power compensation

is determined, chargers can be controlled for reactive power injection or consumption. The objective

is to minimize the total gap between requirement and compensation of all stations along the time as:

\[
\min\; C_{\mathrm{utl}} = \sum_{j \in M} \sum_{t=1}^{T} \left[ Q_{\mathrm{req},j}(t) - \sum_{i \in N_j} Q_{i,j}(t) \right] \tag{3.11}
\]

where $M$ is the set of all stations. $N_j$ is the set of PEV agents for whom station $j$ is within their maximum walking distance. $Q_{\mathrm{req},j}(t)$ is the magnitude of the reactive power compensation requirement at station $j$ at time $t$. $Q_{i,j}(t) \ge 0$ is a continuous control variable indicating the magnitude of reactive power compensation from the charger of PEV $i$ for $Q_{\mathrm{req},j}(t)$. The optimization also includes the control variables appearing in the PEV side optimization. Besides the constraints in the PEV side optimization, additional constraints include:

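\[
\text{(Power Constraint)} \quad Q_{i,j}^2(t) + P_{\mathrm{ch},i,j}^2(t) \le S_{\max,i}^2 \tag{3.12}
\]

\[
\text{(Nonnegative Gap)} \quad Q_{\mathrm{req},j}(t) - \sum_{i \in N_j} Q_{i,j}(t) \ge 0 \tag{3.13}
\]

\[
\text{(Power Availability)} \quad Q_{i,j}(t) \le y_{i,j}(t)\, S_{\max,i} \tag{3.14}
\]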

where Smax,i is the maximum apparent power of the charger in PEV i. Constraint (3.14) ensures that

Qi,j(t) = 0 when PEV i is not parked at station j at time t.

3.3.3 Multi-Objective Optimization Formulation

The multi-objective optimization problem is formulated as (3.15) considering both benefits

for PEV agents and the utility grid, and will be solved for control variable values under the constraints.

For (3.15), Pareto points (multiple optimal solutions) are solved as feasible solutions that do not

dominate each other, i.e., keeping the trade-off between multiple objectives.

\[
\begin{aligned}
\min\; & \{C_{\mathrm{PEV}},\, C_{\mathrm{utl}}\} \\
\text{s.t.}\; & \text{all constraints for (3.10) and (3.11)}
\end{aligned}
\tag{3.15}
\]

3.4 Multi-Objective Optimization Solution Approach

The solution approach of the multi-objective optimization (3.15) is shown in Fig. 3.3,

including linearization, problem reformulation by using NNC method, and problem solving by using

decentralized algorithm.

3.4.1 Problem linearization

The multi-objective optimization (3.15) is a mixed integer non-linear programming problem. It has a non-linear cost function $C_{\mathrm{PEV}}$ with fractional components. There are also non-linear constraints, namely (3.3), (3.5), and (3.6) with bilinear terms, and the quadratic constraint (3.12). To solve (3.15) efficiently, several techniques are used to reformulate it as a MILP problem.

Figure 3.3: Solution approach diagram (stages: nonlinear terms linearization, with extra variables for fractional terms, Glover's linearization scheme for bilinear terms, and piecewise linear approximation with the multiple choice model for quadratic terms; normalized normal constraint (NNC) method from two anchor points to single-objective problems; decentralized algorithm based on Lagrangian decomposition solving Pareto points 1 to m)

The bilinear terms in the constraints are products of a binary variable and an integer or continuous variable, which can be linearized with Glover's linearization scheme [50]. The constraint (3.12) is reformulated as $Q_{i,j}(t) \le g(P_{\mathrm{ch},i,j}(t))$, where $g(P_{\mathrm{ch},i,j}(t))$ is the piecewise linear approximation of $\sqrt{S_{\max,i}^2 - P_{\mathrm{ch},i,j}^2(t)}$. $g(P_{\mathrm{ch},i,j}(t))$ is represented with the multiple choice model [51] in the optimization formulation. Since both $C_{\mathrm{ch},i} + C_{\mathrm{pk},i}$ and $S_i$ are linear functions of continuous and integer variables, the fractional term $(C_{\mathrm{ch},i} + C_{\mathrm{pk},i})/S_i$ can be linearized with the method proposed in [52]. New variables $v_i = 1/S_i$ and $z_{i,j}(t) = P_{\mathrm{ch},i,j}(t)/S_i$ $\forall i, j, t$ are introduced and $C_{\mathrm{PEV}}$ is reformulated as:

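\[
\begin{aligned}
C_{\mathrm{PEV}} &= \sum_{i \in N} C_{\mathrm{PEV},i} \\
&= \sum_{i \in N} \left\{ \tau \sum_{j \in M_i} \sum_{t=1}^{T} \left[ R_{\mathrm{ch},j}(t)\, z_{i,j}(t) + R_{\mathrm{pk},j}(t)\, w_{i,j}(t) \right] + \Big(1 - \sum_{j \in M_i} x_{i,j}\Big) PT_i \right\}
\end{aligned}
\tag{3.16}
\]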

where $w_{i,j}(t) = y_{i,j}(t)\, v_i$. $w_{i,j}(t)$ is introduced to represent the bilinear term $y_{i,j}(t)\, v_i$ after reformulation according to Glover's linearization scheme. With the reformulation of the fractional, bilinear, and quadratic terms, the cost function and related constraints in (3.15) are linearized and (3.15) becomes a multi-objective MILP problem.
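Two of these linearization steps can be checked numerically with a short Python sketch. The first function lists the four linear inequalities commonly used to replace a product $w = y\,v$ of a binary $y$ and a bounded continuous $v \in [v_{lo}, v_{hi}]$, and returns the range they leave for $w$; the bounds are assumed to be known constants. The second function evaluates a chord-based piecewise linear approximation of $\sqrt{S_{\max}^2 - P^2}$ on evenly spaced breakpoints. The breakpoint placement and all names are assumptions for illustration; the sketch only evaluates the approximation and does not build the multiple choice MILP model of [51].

```python
import math


def glover_bounds(y, v, v_lo, v_hi):
    """Range of w allowed by the linear constraints
        v_lo * y <= w <= v_hi * y
        v - v_hi * (1 - y) <= w <= v - v_lo * (1 - y)
    which replace the bilinear equality w = y * v for binary y."""
    lower = max(v_lo * y, v - v_hi * (1 - y))
    upper = min(v_hi * y, v - v_lo * (1 - y))
    return lower, upper


def piecewise_q_limit(p, s_max, segments=4):
    """Chord interpolation of sqrt(s_max^2 - p^2) for p in [0, s_max]."""
    width = s_max / segments
    k = min(int(p / width), segments - 1)       # index of the active segment
    p0, p1 = k * width, (k + 1) * width
    q0 = math.sqrt(s_max**2 - p0**2)
    q1 = math.sqrt(max(0.0, s_max**2 - p1**2))
    return q0 + (q1 - q0) * (p - p0) / (p1 - p0)


parked = glover_bounds(1, 2.5, 1.0, 4.0)
not_parked = glover_bounds(0, 2.5, 1.0, 4.0)
approx = piecewise_q_limit(2.5, 4.0)
```

For either value of the binary variable the allowed range collapses to the single point $y\,v$, so the linear constraints are exact. Because the quarter circle is concave, the chords never exceed it, so any point satisfying the approximated constraint also satisfies (3.12).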



3.4.2 Normalized Normal Constraint Method

Since two objectives are included in the optimization, two anchor points and the Utopia line are first determined in the NNC method. Anchor points are special Pareto points in which only one of the two objectives is optimized. The objective costs are then normalized according to the values at these anchor points. The line joining the two anchor points is called the Utopia line. To solve for $M_P$ Pareto points, the Utopia line is divided evenly with $M_P$ points. At each point on the Utopia line, the related Pareto point is solved by minimizing one of the two objective costs with a normal line constraint added. The NNC method thus transforms the multi-objective optimization into multiple single-objective optimizations, one per Pareto point. Each transformed single-objective optimization is solved by a decentralized algorithm based on Lagrangian decomposition for good scalability.
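The geometric part of the NNC method for two objectives can be sketched in Python as follows. The sketch assumes the usual normalization in which the two anchor points map to (0, 1) and (1, 0), and it assumes that the $M_P$ evenly spaced points include both ends of the Utopia line; the text does not state either detail explicitly. The function names and the numeric anchor values are hypothetical.

```python
def normalize(point, anchor1, anchor2):
    """Map objective values (C_PEV, C_utl) into normalized space.

    anchor1: objective values at the solution minimizing the first objective.
    anchor2: objective values at the solution minimizing the second objective.
    After normalization, anchor1 -> (0, 1) and anchor2 -> (1, 0).
    """
    f1 = (point[0] - anchor1[0]) / (anchor2[0] - anchor1[0])
    f2 = (point[1] - anchor2[1]) / (anchor1[1] - anchor2[1])
    return (f1, f2)


def utopia_line_points(m_p):
    """m_p evenly spaced points on the normalized Utopia line."""
    if m_p < 2:
        raise ValueError("need at least two points")
    return [(k / (m_p - 1), 1 - k / (m_p - 1)) for k in range(m_p)]


def satisfies_normal_constraint(candidate, line_point):
    """Normal line constraint N . (candidate - line_point) <= 0,
    with N = anchor2 - anchor1 = (1, -1) in normalized space."""
    return (candidate[0] - line_point[0]) - (candidate[1] - line_point[1]) <= 0


points = utopia_line_points(3)
mid = normalize((20.0, 5.0), anchor1=(10.0, 8.0), anchor2=(30.0, 2.0))
```

For each generated point, one single-objective problem is then solved: the second normalized objective is minimized over the solutions that satisfy the normal line constraint at that point.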

3.4.3 Decentralized Algorithm Based on Lagrangian Decomposition

3.4.3.1 Framework of decentralized algorithm

Decentralized optimization algorithms decompose a large scale problem to subproblems

with smaller scales and solve them simultaneously. Since each subproblem is much easier to solve,

decentralized algorithms are efficient for large scale complex optimizations. The decentralized

algorithm is designed with the framework shown in Fig. 3.4 for solving anchor points and Pareto

points. Three types of primal problems, denoted as PPEV, Putl, and PPm, are for finding two anchor

points and mth Pareto point, respectively. Generally, the number of PEVs is larger than the number

of stations in the scheduling problem. PPEV, Putl, and PPm are decomposed in terms of PEVs for

smaller scale subproblems which can be solved more efficiently. Any constraints coupled among

PEVs should be first relaxed, including the parking capacity constraints, the normal line constraint

introduced by the NNC method, and the nonnegative gap constraint (3.13). Taking $P_{\mathrm{PEV}}$ as an example, its Lagrangian relaxation $\mathrm{LRP}_{\mathrm{PEV}}$ has the following form:

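\[
\min\; \sum_{i \in N} C_{\mathrm{PEV},i} + \sum_{j \in M} \sum_{t=1}^{T} \lambda_{j,t} \Big( \sum_{i \in N_j} y_{i,j}(t) - A_{\mathrm{cap},j} \Big) \tag{3.17}
\]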

where Acap,j is the capacity of station j. The minimization of (3.17) is subject to the same constraints

in the primal problem PPEV except for the relaxed station capacity constraint. λj,t is a non-negative

Lagrangian multiplier. LRPutl and LRPPm are formulated in similar ways.



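The relaxed primal problems $\mathrm{LRP}_{\mathrm{PEV}}$, $\mathrm{LRP}_{\mathrm{utl}}$, and $\mathrm{LRP}_{P_m}$ are then decomposed into Lagrangian subproblems $\mathrm{LSP}_{\mathrm{PEV},i}$, $\mathrm{LSP}_{\mathrm{utl},j}$, and $\mathrm{LSP}_{P_m,i}$, respectively. After removing constant terms, the subproblem $\mathrm{LSP}_{\mathrm{PEV},i}$ for PEV owner $i$ is: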

\[
\min\; C_{\mathrm{PEV},i} + \sum_{j \in M_i} \sum_{t=1}^{T} \lambda_{j,t}\, y_{i,j}(t) \tag{3.18}
\]

subject to the constraints in LRPPEV relating to PEV i. Similarly, formulations of LSPutl,j and

LSPPm,i can be determined. Each subproblem is a MILP problem and solved independently. With

constraints relaxation, solutions from subproblems may not be feasible for the primal problems.

These solutions form lower bounds (LBs) of the primal problems and need to be restored as feasible

solutions. Feasibility restoration heuristics are designed to recover solutions of subproblems and

obtain upper bounds (UBs) of primal problems. Lagrangian dual problems LDPPEV, LDPutl, and

LDPPm are solved to maximize the LBs. In each iteration, the algorithm tries to reduce the duality

gaps between the LBs and the UBs until the termination condition is satisfied. The Lagrangian dual

problem LDPPEV has the formulation as:

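\[
\max_{\lambda \ge 0}\; C_{\mathrm{LRP}_{\mathrm{PEV}}}(\lambda) \tag{3.19}
\]

where $C_{\mathrm{LRP}_{\mathrm{PEV}}}$ is the cost function of $\mathrm{LRP}_{\mathrm{PEV}}$ and $\lambda$ is the vector of Lagrangian multipliers. Formulations of $\mathrm{LDP}_{\mathrm{utl}}$ and $\mathrm{LDP}_{P_m}$ are determined in similar ways.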

3.4.3.2 Subgradient search

Subgradient search is widely used to solve Lagrangian dual problems [53, 54]. It starts with selected multipliers, which are then updated iteratively according to the subgradients of the relaxed constraints, the lower bound, and the best upper bound of the primal problem found so far. In the $r$th iteration, $\alpha^r > 0$ is a scalar coefficient determining the step size of the multiplier update. If the lower bound or the best upper bound does not improve for a number of iterations, $\alpha^{r+1}$ is decreased by a discount factor in the next iteration. The algorithm terminates under one of three conditions: the gap between the best upper bound and the current lower bound is smaller than a preset threshold, $\alpha^r$ is less than a small value, or the maximum iteration number is reached.

Taking $\mathrm{LDP}_{\mathrm{PEV}}$ as an example, the subgradient search algorithm is shown in Algorithm 3. $\gamma^r_{j,t}$ and $s^r$ are the subgradient and step size, respectively, in the $r$th iteration. $y^{*r}_{i,j}(t)$ is the optimal parking status of PEV $i$ obtained by solving $\mathrm{LSP}_{\mathrm{PEV},i}$. $C^r_{\mathrm{LRP}_{\mathrm{PEV}}}$ is the optimal cost of $\mathrm{LRP}_{\mathrm{PEV}}$ solved in the $r$th iteration. It is the lower bound of $P_{\mathrm{PEV}}$. $UB^*$ is the best upper bound found by feasibility restoration. $\alpha^r$ is a scalar satisfying $\alpha^r > 0$. If $C^r_{\mathrm{LRP}_{\mathrm{PEV}}}$ does not improve for a number of iterations, $\alpha^{r+1}$ is decreased by a discount factor in the next iteration. The algorithm terminates under one of three conditions: the gap between $UB^*$ and $C^r_{\mathrm{LRP}_{\mathrm{PEV}}}$ is smaller than a preset threshold, $\alpha^r$ is smaller than a small value, or the maximum iteration number is reached.

Figure 3.4: Framework of the decentralized optimization with Lagrangian relaxation (blocks: Primal, Lagrangian Relaxation, Subproblem, Decentralized Optimization, Feasibility Recovery, Lagrangian Dual, Subgradient Search)

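Algorithm 3 Subgradient search for $LDP_{PEV}$

1: Let $\lambda^0_{j,t} = 0 \;\; \forall j, t$;

2: while termination condition is not satisfied do

3:   Collect optimal solutions from all $LSP_{PEV}$;

4:   $\gamma^r_{j,t} = \sum_{i \in N_j} y^{*r}_{i,j}(t) - A_{cap,j} \;\; \forall j, t$;

5:   $s^r = \alpha^r \dfrac{UB^* - C^r_{LRP_{PEV}}(\lambda)}{\sum_{j \in M} \sum_{t=1}^{T} (\gamma^r_{j,t})^2}$;

6:   $\lambda^{r+1}_{j,t} = \max\{0, \lambda^r_{j,t} + s^r \gamma^r_{j,t}\} \;\; \forall j, t$;

7:   Send $\lambda^{r+1}_{j,t}$ to all $LSP_{PEV}$;

8:   $r \leftarrow r + 1$;

9: end while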
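The multiplier update in lines 4 to 6 of Algorithm 3 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `subgradient_step`, the dictionary layout keyed by (station, time slot), and the handling of a zero subgradient are assumptions made here, not part of the original implementation (which the text states was written in Java and MATLAB).

```python
from typing import Dict, Tuple

Key = Tuple[int, int]  # (station j, time slot t); layout is an assumption


def subgradient_step(
    lmbda: Dict[Key, float],      # current multipliers lambda^r_{j,t}
    occupancy: Dict[Key, float],  # sum over PEVs of y*_{i,j}(t) from the subproblems
    capacity: Dict[int, float],   # A_cap,j per station
    ub_best: float,               # UB*, best upper bound found so far
    lb_current: float,            # C^r, cost of the relaxed problem (lower bound)
    alpha: float,                 # scalar alpha^r > 0
) -> Tuple[Dict[Key, float], float]:
    """Return (lambda^{r+1}, s^r) for one iteration of the subgradient search."""
    # Line 4: subgradient = violation (or slack) of each relaxed capacity constraint.
    gamma = {jt: occupancy[jt] - capacity[jt[0]] for jt in lmbda}
    norm_sq = sum(g * g for g in gamma.values())
    if norm_sq == 0.0:
        # Every relaxed constraint is exactly tight; keep the multipliers (assumption).
        return dict(lmbda), 0.0
    # Line 5: step size scaled by the current gap between the bounds.
    step = alpha * (ub_best - lb_current) / norm_sq
    # Line 6: projected update keeps the multipliers non-negative.
    new_lmbda = {jt: max(0.0, lmbda[jt] + step * gamma[jt]) for jt in lmbda}
    return new_lmbda, step


if __name__ == "__main__":
    lam, s = subgradient_step(
        {(0, 0): 0.0, (0, 1): 0.0},
        {(0, 0): 5.0, (0, 1): 2.0},
        {0: 3.0},
        ub_best=110.0,
        lb_current=100.0,
        alpha=1.0,
    )
    print(lam, s)
```

In the example call, the overloaded slot (0, 0) receives a positive multiplier, while the under-used slot (0, 1) is projected back to zero.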



3.4.3.3 Feasibility restoration

Optimal solutions of the Lagrangian subproblems are usually infeasible for the primal problems, so feasibility must be restored. The main idea is to check the solutions of the subproblems one by one, per PEV or per station, in a predefined order. A subproblem's solution is kept if it is feasible. After feasibility checking, all subproblems with infeasible solutions are collected and their solutions are adjusted heuristically until they become feasible. Taking $P_{PEV}$ as an example, the feasibility restoration heuristic first orders PEVs by increasing values of $C_{PEV,i}$. The feasibility of each subproblem's solution is checked in this order. Subproblems with infeasible solutions are collected and rescheduled later by adjusting charging station assignments or parking intervals. The heuristic can also improve solution quality, i.e., reduce the objective cost of the primal problem. The heuristic feasibility restoration for the primal $P_{PEV}$ is shown in Algorithm 4.

Algorithm 4 Feasibility restoration for $P_{PEV}$

1: Let $N_F = \emptyset$ and $N_D = \emptyset$;

2: Order all PEVs by increasing value of monetary cost per convenience plus drop penalty;

3: for each PEV $i_k$ in order do

4:   if $\sum_{i \in N_F} y_{i,j}(t) + y_{i_k,j}(t) \le A_{cap,j} \;\; \forall j, t$ then

5:     Accept the solution from $LSP_{PEV,i_k}$, $N_F = N_F \cup \{i_k\}$;

6:   else

7:     $N_D = N_D \cup \{i_k\}$;

8:   end if

9: end for

10: Order the PEVs in $N_D$ by increasing value of scheduled parking interval $t_{pk,i}$;

11: for each PEV $i \in N_D$ in order do

12:   Gradually reduce $t_{pk,i}$ by increasing $t_{s,i}$ or decreasing $t_{e,i}$ until it can be scheduled to one of the charging stations or further adjustment is not acceptable for the PEV owner;

13:   if PEV $i$ can be scheduled then

14:     Accept the scheduling for PEV $i$;

15:   else

16:     Discard $i$ without scheduling;

17:   end if

18: end for
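The two phases of Algorithm 4 (accept feasible solutions in priority order, then shrink and reschedule the deferred PEVs) can be sketched as below. This is a hedged illustration, not the original code: the `Request` record, the `min_len` field standing in for "acceptable for the PEV owner", and the systematic search over shorter windows and all stations are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Slot = Tuple[int, int]               # (station j, time slot t)
Placement = Tuple[int, int, int]     # (station, start slot, end slot exclusive)


@dataclass
class Request:
    pev: str       # PEV identifier
    cost: float    # ordering key of line 2 (cost per convenience plus drop penalty)
    station: int   # station chosen by the PEV's Lagrangian subproblem
    start: int     # first occupied slot, t_s
    end: int       # one past the last occupied slot, t_e
    min_len: int   # hypothetical: shortest interval the owner still accepts


def fits(load: Dict[Slot, int], cap: Dict[int, int], j: int, start: int, end: int) -> bool:
    """True if one more PEV fits at station j in every slot of [start, end)."""
    return all(load.get((j, t), 0) + 1 <= cap[j] for t in range(start, end))


def occupy(load: Dict[Slot, int], j: int, start: int, end: int) -> None:
    """Record one more PEV at station j for every slot of [start, end)."""
    for t in range(start, end):
        load[(j, t)] = load.get((j, t), 0) + 1


def shrink_and_place(r: Request, load: Dict[Slot, int], cap: Dict[int, int]) -> Optional[Placement]:
    """Try progressively shorter windows inside [r.start, r.end) at any station."""
    full = r.end - r.start
    for length in range(full, max(1, r.min_len) - 1, -1):
        for start in range(r.start, r.end - length + 1):
            for j in sorted(cap):
                if fits(load, cap, j, start, start + length):
                    return j, start, start + length
    return None


def restore_feasibility(requests: List[Request], cap: Dict[int, int]) -> Dict[str, Optional[Placement]]:
    load: Dict[Slot, int] = {}
    schedule: Dict[str, Optional[Placement]] = {}
    deferred: List[Request] = []
    # Lines 2-9: keep subproblem solutions that respect station capacity.
    for r in sorted(requests, key=lambda q: q.cost):
        if fits(load, cap, r.station, r.start, r.end):
            occupy(load, r.station, r.start, r.end)
            schedule[r.pev] = (r.station, r.start, r.end)
        else:
            deferred.append(r)
    # Lines 10-18: reschedule deferred PEVs, shortest parking interval first.
    for r in sorted(deferred, key=lambda q: q.end - q.start):
        placed = shrink_and_place(r, load, cap)
        if placed is not None:
            occupy(load, *placed)
        schedule[r.pev] = placed  # None means the request is dropped
    return schedule
```

A PEV mapped to `None` corresponds to line 16 of Algorithm 4 (discarded without scheduling).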



3.5 Results and Analysis

The designed algorithms are implemented in Java and MATLAB and run on a desktop with an Intel i7-2600 CPU and 16 GB of RAM. Each subproblem is solved by CPLEX in the TOMLAB optimization toolbox [55]. The simulation covers an 11-hour period with a one-hour time resolution. In the simulation, a notional distribution grid is constructed on a part of the Boston Back Bay region shown in Fig. 3.5. It is a 2 km × 0.8 km commercial area with five types of buildings: office buildings, apartment buildings, retail stores, restaurants, and storage buildings. Their load demand and power factors are generated according to load and power correction study reports [56, 57, 58, 59]. The total real power load in the area is generated between 10 MW and 20 MW during the daytime. Power factors are generated between 0.90 and 0.97 according to the load types. The area is configured with 4 garages and 48 on-street charging stations, each of which consists of multiple charging spaces and adopts the AC level-2 charging standard. A day-ahead time-of-use (TOU) charging rate is applied to shave the peak charging load through scheduling. Charging rates are set between 0.18 $/kWh and 0.36 $/kWh, with the peak period between 3 pm and 6 pm. Hourly parking prices are set to $2 and $4 for regular and busy streets, respectively. Destinations of PEVs are generated randomly in the area. PEVs are equipped with 16 kWh batteries and on-board chargers with Smax = 9.6 kVA.

Figure 3.5: Simulation case setup for a distribution feeder and locations of charging stations (map legend: substation, low-voltage load node, on-street charging station, parking garage)

To evaluate the performance of the proposed system, seven test cases are simulated with different configurations of PEV numbers ($N_{pev}$), total garage charging capacities ($C_{gar}$), total on-street charging capacities ($C_{str}$), battery charging requirements as a percentage of battery capacity ($\eta_{ch}$), and on-street parking intervals ($T_{str}$), which are shown in Table 3.1. For PEV agents with office buildings as destinations, parking intervals are relatively long and are randomly generated between 7 hours and 11 hours as regular parking intervals. PEVs with other destinations are modeled with shorter and more flexible on-street parking intervals $T_{str}$. The PEV density ratio $\beta = N_{pev} \bar{T}_{pk} / [T_{pd}(C_{gar} + C_{str})]$ is defined as a metric to describe the PEV penetration level, where $\bar{T}_{pk}$ is the average PEV parking interval and $T_{pd} = 11$ h is the simulation time period. A larger $\beta$ indicates that charging stations are scarcer and that more conflicts may occur in PEV charging reservation. Values of $\beta$ for each case are listed in Table 3.1. The PEV number in cases 1, 2, and 3 is scaled up with similar $\beta$. In cases 6 and 7, charging station capacities are reduced and $\beta$ becomes larger. Case 4 represents a scenario in which PEV agents have less charging demand. Case 5 is configured with shorter $T_{str}$, which simulates a busy area with more frequent PEV turnover.

Table 3.1: Parking interval and station capacity configurations for different cases

Case Index   Npev   Cgar   Cstr   ηch       Tstr    β
1            250    40     240    30∼70%    2∼9h    0.483
2            350    60     336    30∼70%    2∼9h    0.480
3            450    100    432    30∼70%    2∼9h    0.471
4            450    100    432    10∼30%    2∼9h    0.471
5            350    60     336    30∼70%    2∼5h    0.342
6            350    40     240    30∼70%    2∼9h    0.661
7            450    40     240    30∼70%    2∼9h    0.840
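As a quick numerical check of the density ratio, the definition of β can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and the average parking interval is not tabulated in the text, so the second function simply inverts the definition to show what average interval a tabulated β would imply.

```python
def pev_density_ratio(n_pev: int, avg_parking_h: float, horizon_h: float,
                      c_gar: int, c_str: int) -> float:
    """beta = N_pev * T_pk_avg / (T_pd * (C_gar + C_str))."""
    return n_pev * avg_parking_h / (horizon_h * (c_gar + c_str))


def implied_avg_parking(beta: float, n_pev: int, horizon_h: float,
                        c_gar: int, c_str: int) -> float:
    """Invert the definition of beta to recover the average parking interval (hours)."""
    return beta * horizon_h * (c_gar + c_str) / n_pev


if __name__ == "__main__":
    # Case 1 of Table 3.1: N_pev = 250, C_gar = 40, C_str = 240, beta = 0.483, T_pd = 11 h.
    t_pk = implied_avg_parking(0.483, 250, 11.0, 40, 240)
    print(f"implied average parking interval: {t_pk:.2f} h")
    print(f"beta recomputed: {pev_density_ratio(250, t_pk, 11.0, 40, 240):.3f}")
```

Under these inputs the implied average parking interval is just under 6 hours, which lies between the on-street range and the office-destination range quoted in the text.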

In each study case, 10 Pareto points are solved. The maximum iteration number for solving each Pareto point is limited to 150. The duality gap (UB − LB)/LB × 100% is defined to evaluate solution quality. The iterations for solving an anchor point in case 2 are shown in Fig. 3.6. Over the iterations, the best LB and UB converge to steady values. Cases 2, 3, and 6 are considered as examples. Their normalized Pareto frontiers are shown in Fig. 3.7. Here m is the index of the Pareto points, in which m = 0 and m = 9 are the two anchor points optimizing grid compensation performance and PEV agent benefits, respectively. As m increases, the PEV agent benefits are gradually weighted more heavily. Duality gaps of these Pareto points are shown in Fig. 3.8 and are all below 5%, indicating satisfactory results for the complex MILP problem. The Pareto frontier given in Fig. 3.7 shows that the benefits of PEV agents and the utility grid are generally in conflict, especially for case 6 with larger β. As m increases from 0 to 2, the objective cost on the PEV side decreases greatly with similar grid compensation performance. Thus, optimizing only one objective will substantially worsen the other objective. By selecting appropriate Pareto points, the benefits of PEV agents can be greatly improved without sacrificing much grid compensation. Among the obtained Pareto points, one can be selected by an aggregator for the best grid compensation while satisfying PEV benefits.

Figure 3.6: Iterations of solving an anchor point in case 2 (x-axis: iteration step, 0 to 150; y-axis: objective cost on grid side, ×10^4; curves: best lower bound, best upper bound; annotated duality gap = 1.32%)

Figure 3.7: Pareto optimal points solved in cases 2, 3, and 6 (x-axis: normalized PEV objective cost; y-axis: normalized station objective cost; series: case 2, case 3, case 6; anchor points m = 0 and m = 9 marked)
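The duality gap used above is a one-line formula; the sketch below evaluates it together with the "best bound so far" bookkeeping that the best LB and UB curves imply. The function names and the sample bound values are hypothetical and chosen only to illustrate the arithmetic.

```python
from typing import Sequence, Tuple


def best_bounds(lower: Sequence[float], upper: Sequence[float]) -> Tuple[float, float]:
    """Best (largest) lower bound and best (smallest) upper bound seen so far."""
    return max(lower), min(upper)


def duality_gap_percent(ub: float, lb: float) -> float:
    """Duality gap (UB - LB) / LB * 100, as defined in the text."""
    if lb <= 0:
        raise ValueError("the gap is defined here only for a positive lower bound")
    return (ub - lb) / lb * 100.0


if __name__ == "__main__":
    # Hypothetical bound histories over four iterations (not data from the dissertation).
    lbs = [8000.0, 9500.0, 9400.0, 10000.0]
    ubs = [15000.0, 11000.0, 10132.0, 10500.0]
    lb, ub = best_bounds(lbs, ubs)
    print(f"gap = {duality_gap_percent(ub, lb):.2f}%")
```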

The influence of parking patterns and station capacities on PEV agent benefits is further analyzed for cases 2, 5, and 6. Figs. 3.9 and 3.10 show the average unit cost per convenience $C_{u,pev}$ of scheduled PEVs and the total PEV drop penalty $PT_{total}$, respectively, for different amounts of average reactive power compensation. Among the three cases, case 2 can achieve the largest amount of reactive power compensation because it has both a longer average parking interval and a larger station capacity. With moderate reactive power compensation, i.e., between 1.0 MVAr and 1.4 MVAr, $C_{u,pev}$ in case 5 is smaller than that in case 2 for the same reactive power compensation. This is because the smaller $T_{str}$ in case 5 makes PEV scheduling less competitive and less prone to conflicts. However, $C_{u,pev}$ increases greatly when high reactive power compensation is demanded. The limited station capacity in case 6 results in both larger $C_{u,pev}$ and larger $PT_{total}$, indicating that the benefits of more PEV agents are affected or that their charging requests are dropped.

Figure 3.8: Duality gaps of Pareto optimal points in cases 2, 3, and 6 (x-axis: Pareto point index m, 0 to 9; y-axis: duality gap in %, 0 to 5; series: case 2, case 3, case 6)

Power analysis is further carried out on the distribution network in the seven cases with a load flow study [60]. The power loss ratio is defined as $\gamma_{ploss} = (P_{sub} - P_{load})/P_{sub}$, where $P_{sub}$ is the real power measured at the substation and $P_{load}$ is the total real power load demand. The average power loss ratios $\gamma_{ploss}$ from 8 AM to 6 PM are calculated for the seven cases and shown in Fig. 3.11. In each case, three charging schemes are studied. The first scheme only considers PEV charging without providing reactive power compensation to the grid. In this scheme, the PEV agent benefits defined in (3.10) are optimized. The other two schemes correspond to the two Pareto points with m = 0 and m = 9. The base $\gamma_{ploss}$ without PEV penetration is 0.00638, shown as the dashed line in Fig. 3.11. Results show that $\gamma_{ploss}$ increases when PEVs are charged without reactive power compensation. However, if reactive power compensation is provided and optimized, up to a 9% reduction of $\gamma_{ploss}$ is achieved by the Pareto point m = 0 in case 4. The exact $\gamma_{ploss}$ improvement is sensitive to the number of PEVs, the charging station capacities, and the PEV charging demand. Among these factors, the PEV charging demand influences $\gamma_{ploss}$ most. With smaller charging demand, more reactive power can be provided by the PEV chargers during parking. When the PEV number increases while the total station capacity is kept the same, $\gamma_{ploss}$ first decreases and then converges to a steady value, as reflected by cases 1, 6, and 7. This is because most of the extra PEVs in case 7 cannot be scheduled for charging and parking due to the limited station capacity and therefore do not affect the grid power loss much. When both the PEV number and the total station capacity are scaled up with similar β, $\gamma_{ploss}$ first decreases and then increases, as shown in cases 1, 2, and 3. This is because the increase in PEV charging load causes more power loss, which affects the overall power loss ratio.

Figure 3.9: Average scheduled PEV unit cost per convenience of Pareto points in 3 study cases (x-axis: average reactive power compensation in MVAr; y-axis: average PEV unit cost; series: case 2, case 5, case 6)

Figure 3.10: Total PEV drop penalty of Pareto points in 3 study cases (x-axis: total reactive power compensation in MVAr; y-axis: total PEV drop penalty; series: case 2, case 5, case 6)
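The power loss ratio defined above can be computed per hour and then averaged. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the average is taken as the plain mean of hourly ratios (the text does not say whether ratios or powers are averaged), and the numbers in the example are made up.

```python
from typing import Iterable, Tuple


def power_loss_ratio(p_sub: float, p_load: float) -> float:
    """gamma_ploss = (P_sub - P_load) / P_sub for one time step."""
    if p_sub <= 0:
        raise ValueError("substation power must be positive")
    return (p_sub - p_load) / p_sub


def average_loss_ratio(hourly: Iterable[Tuple[float, float]]) -> float:
    """Mean of the hourly ratios (assumed averaging rule)."""
    ratios = [power_loss_ratio(p_sub, p_load) for p_sub, p_load in hourly]
    return sum(ratios) / len(ratios)


def relative_reduction(ratio: float, base: float) -> float:
    """Fractional reduction of a loss ratio with respect to a reference value."""
    return (base - ratio) / base


if __name__ == "__main__":
    # Hypothetical (P_sub, P_load) pairs in MW for two hours.
    hours = [(100.0, 99.0), (200.0, 197.0)]
    print(f"average loss ratio: {average_loss_ratio(hours):.4f}")
    print(f"relative reduction: {relative_reduction(0.009, 0.010):.1%}")
```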

Figure 3.11: Average power loss ratios of three charging schemes in the 7 test cases (x-axis: case index, 1 to 7; y-axis: average power loss ratio, ×10^-3; series: PEV charging only, Pareto point m = 9, Pareto point m = 0)

Chapter 4

On-road PHEV Power Management in

Vehicular Networks


In existing offline PHEV power management systems, deterministic or stochastic optimization problems are formulated to obtain optimal power strategies based on historical driving cycles. These strategies are then used in future driving. Online systems make power decisions according to real-time driving states. Many PHEV power management systems are based on the powertrain model with a PSD [61, 62, 63, 64]. PHEV fuel efficiency can be optimized by controlling the PSD gear ratio. For offline PHEV power management, [65] and [66] utilize historical traffic cycles to optimize fuel consumption with dynamic programming (DP) in the temporal and spatial domains, respectively. In [66], the authors use a segment-based road model to reduce computational complexity and obtain a closed-form solution. Multiple kinds of information, including the slope grade, speed, and acceleration/deceleration, are obtained from historical data for a selected route. However, no stochastic driving cycles or traffic conditions are considered in the models for real driving. In [67], the power management strategy is represented by a pair of power parameters describing the thresholds for ICE and battery power control. The optimization problem is solved analytically. The solutions are optimal in a statistical sense but not for an individual trip. [64] proposes a stochastic optimal control approach for PHEV power management based on a Markov decision process (MDP). However, the MDP is modeled with an infinite horizon and can hardly be applied to applications that are sensitive to the trip length, such as trip fuel consumption optimization. In summary, offline power management systems are usually limited by historical driving cycles and cannot adapt to real driving conditions for optimal performance.

For online power management, systems are designed with various control methods. A fuzzy controller is developed in [68] to determine the power output split between the EM and the ICE. Its proposed baseline control strategy makes the ICE work near its optimal operation line and optimizes both fuel consumption and emissions. A rule-based supervisory equivalent fuel consumption control system is proposed in [69] to minimize the equivalent fuel consumption. In [70], a model predictive control system combined with statically solved power set points is designed to minimize fuel consumption and emissions. [71] proposes a power-balancing strategy for a parallel PHEV, in which the ICE operation is kept in the peak-efficiency region. This strategy does not rely on a priori trip information and can easily be applied on the road. Inputs to these online systems include the real-time pedal position, battery SOC, and vehicle speed. Decisions are optimal torque/power splits between the ICE and the EM according to real-time driving states. These techniques rely on analysis of PHEV powertrain models, e.g., ICE and EM efficiency maps, the integrated starter generator, and the ICE optimal operation line, without utilizing trip information such as driving routes and cycles. Thus, these systems lack an overview of entire trips. Their power decisions are optimal for individual driving states, but not for specific trips.

Few works study the integration of online and offline PHEV power management in the context of vehicular networks, where extra real-time information, i.e., vehicle speed prediction, is available for optimal on-road power management. Our proposed system leverages such information in a two-level hierarchical power management scheme.

4.2 System Design

4.2.1 Overview of the System

The scheme of the proposed on-road PHEV power management CPS is shown in Fig. 4.1. The physical part consists of smartphones and the PHEV powertrain. The smartphone is capable of wireless communication through embedded modules (WiFi, 3G, Bluetooth). It is also equipped with a GPS navigation system, an accelerometer/gyroscope, and high-capacity data storage. It serves as a mobile in-situ vehicle state sensor, a communication device, and a computation unit running power management algorithms. The PHEV powertrain includes the ICE, motors/generators (M/Gs), battery, PSD, etc. The cyber system includes the vehicular network for traffic measurement and vehicle speed prediction, power management algorithms, and real-time/historical driving information. It is assumed that traffic information for urban arterials and freeways is retrieved from the vehicular network by smartphones. The vehicle average speed is then predicted in real time and used by the power management system. The GPS navigation system is used to obtain the driving route and location information. The PHEV's driving states, e.g., speed and acceleration, are measured by the smartphone's embedded sensors. This information is combined to generate driving traces, which are stored in smartphones for later modeling.

Figure 4.1: Scheme of on-road PHEV power management system (diagram blocks: Cyber System with VANET, Traffic Measurement, Traffic Prediction, Speed Prediction, Energy Budget, Historical Driving Cycles, MDP Policy, Power Management Algorithm, Online and Offline paths; Physical System with Smartphone (Wireless, Navigation, Sensors, Storage) and PHEV (ICE, M/G1, M/G2); links labeled Acquisition, Actuation, Status Update)

The PHEV power management system is designed with two-level hierarchical optimizations, a high-level online optimization and a low-level offline optimization, to reduce the computational complexity for on-road applications. To achieve minimum fuel consumption for a trip, the overall battery energy consumption along the route should first be regulated. This may not be necessary for a short trip when the battery energy is sufficient to sustain the entire trip. However, for mid- or long-distance driving, the limited battery energy should be allocated carefully, as an energy budget, to each road according to the varying average fuel efficiency determined by the vehicle driving conditions. When the driving conditions result in low fuel efficiency, more battery energy should be discharged for M/G torque generation rather than driving the vehicle directly with the ICE, and vice versa. These decisions should also be updated in real time to adapt dynamically to changes in the battery state of charge (SOC), the driving route, and the average driving speed. Thus, battery budgets are generated online using real-time vehicle speed prediction. In contrast, optimal powertrain operation policies, i.e., the ICE speed and torque split ratio under different driving states, do not change during driving because they are determined by the physical characteristics of the powertrain. Therefore, low-level powertrain operation policies are generated offline based on historical driving cycles. Given the battery budgets and the real-time PHEV driving states, real-time power management decisions are made by looking up the solved policy tables.

4.2.2 System Models

The PHEV power management CPS is based on three important parts: PHEV powertrain

with PSD, unit cycles in spatial domain, and power management algorithms. We next describe them

in detail.

4.2.2.1 PHEV Powertrain with PSD

The PHEV powertrain is configured with a PSD and is used in the low-level power management. The PHEV powertrain model diagram shown in Fig. 4.2 includes powertrain components, power flows, and torque flows. The major powertrain components include an ICE, two M/Gs, a planetary gear, an inverter, and a battery pack. The two M/Gs differ in size. M/G2 has a larger power output and provides traction torque to the car together with the ICE. Its other function is to recharge the battery through regenerative braking. The smaller M/G1 works as a power generator to drive M/G2 or charge the battery. As the PSD, the planetary gear connects the ICE, M/G1, and M/G2 and splits the ICE torque output $T_{ICE}$ into two parts, $T_{ICE,1}$ and $T_{ICE,2}$. $T_{ICE,1}$ is applied to M/G1 to generate electric power $P_{M/G1}$. $T_{ICE,2}$ is applied directly to the final drive shaft to meet the PHEV's torque demand $T_{fd}$ together with the M/G2 torque output $T_{M/G2}$. $P_{M/G1}$ is first provided to M/G2 for torque generation. If the power demand $P_{M/G2}$ of M/G2 for torque generation is larger than $P_{M/G1}$, extra power $P_B$ is drawn from the battery. On the other hand, if $P_{M/G2} < P_{M/G1}$, the remaining part of $P_{M/G1}$ charges the battery.

Figure 4.2: A PHEV model with PSD (diagram blocks: ICE, M/G1, M/G2, Planetary Gear, Battery, Fuel Tank, Final Drive Shaft; torque flows $T_{ICE}$, $T_{ICE,1}$, $T_{ICE,2}$, $T_{M/G2}$, $T_{fd}$; power flows $P_{M/G1}$, $P_B$; fuel input)

The PSD gives the powertrain two degrees of control freedom. The ICE speed $\omega_{ICE}$ and the M/G2 torque $T_{M/G2}$ are selected as control variables for the low-level optimization. The following constraints can be derived from the powertrain model with PSD for the low-level optimization formulation:

$$
\begin{aligned}
T_{ICE} &= (1 + \rho)\,(T_{fd} - T_{M/G2}) \\
T_{ICE,1} &= \rho\,(T_{fd} - T_{M/G2}) \\
\omega_{M/G1} &= \frac{\omega_{ICE}(\rho + 1) - K\omega_{wh}}{\rho} \\
\omega_{M/G2} &= K\omega_{wh}
\end{aligned}
\tag{4.1}
$$

where $\rho = N_s/N_r$, and $N_s$ and $N_r$ are the numbers of teeth of the sun gear and the ring gear of the PSD, respectively. $\omega_{M/G1}$, $\omega_{M/G2}$, and $\omega_{wh}$ are the speeds of M/G1, M/G2, and the wheel (rad/s), respectively. The wheel speed is determined by the vehicle speed. $T_{fd}$ is the torque required on the final drive shaft for that speed. $K$ is the final drive ratio. Additional constraints include the torque and speed limits of the ICE, M/G1, and M/G2, and the battery charging/discharging power limit. The ICE fuel consumption flow $\dot{fuel}$ (g/s) is included in the objective function as:

$$
\dot{fuel} = \frac{T_{ICE}\,\omega_{ICE}}{\eta_{ICE}\,H_l}
\tag{4.2}
$$

where $H_l$ is the lower heating value of the fuel (J/g) and $\eta_{ICE}$ is the ICE fuel efficiency. $\eta_{ICE}$ has a nonlinear relationship with $T_{ICE}$ and $\omega_{ICE}$ and can be determined by looking up the fuel efficiency map. The battery is approximated as a voltage source with an internal resistance. The change of battery SOC through charging/discharging can be expressed as:

$$
\dot{SOC} = -\frac{I_b}{Q_b} = -\frac{V_{oc} - \sqrt{V_{oc}^2 - 4R_bP_b}}{2R_bQ_b}
\tag{4.3}
$$

where $I_b$ is the battery discharging/charging current (positive/negative value), $V_{oc}$ is the battery open-circuit voltage, $R_b$ is the battery internal resistance, $P_b$ is the corresponding battery power exchange, and $Q_b$ is the battery charge capacity. The detailed objective function formulation is discussed in Section 4.3.2.
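Equations (4.1) to (4.3) translate directly into code. The following sketch is for illustration only: the function names, the constant efficiency passed in place of a fuel efficiency map lookup, the SI units (with $Q_b$ in ampere-seconds), and all numeric values in the example are assumptions, not parameters taken from the dissertation.

```python
import math
from typing import Dict


def psd_constraints(t_fd: float, t_mg2: float, w_ice: float, w_wh: float,
                    rho: float, k_fd: float) -> Dict[str, float]:
    """Torque and speed relations of Eq. (4.1) for given controls (w_ice, t_mg2)."""
    t_ice = (1.0 + rho) * (t_fd - t_mg2)            # total ICE torque
    t_ice1 = rho * (t_fd - t_mg2)                   # part applied to M/G1
    w_mg1 = (w_ice * (rho + 1.0) - k_fd * w_wh) / rho
    w_mg2 = k_fd * w_wh
    return {"T_ICE": t_ice, "T_ICE1": t_ice1, "w_MG1": w_mg1, "w_MG2": w_mg2}


def fuel_rate(t_ice: float, w_ice: float, eta_ice: float, h_l: float) -> float:
    """Eq. (4.2): fuel flow in g/s; eta_ice would normally come from an efficiency map."""
    return t_ice * w_ice / (eta_ice * h_l)


def soc_rate(p_b: float, v_oc: float, r_b: float, q_b: float) -> float:
    """Eq. (4.3): SOC derivative in 1/s; p_b > 0 means discharging."""
    disc = v_oc ** 2 - 4.0 * r_b * p_b
    if disc < 0.0:
        raise ValueError("requested battery power exceeds what the circuit model can deliver")
    i_b = (v_oc - math.sqrt(disc)) / (2.0 * r_b)
    return -i_b / q_b


if __name__ == "__main__":
    # Hypothetical operating point.
    out = psd_constraints(t_fd=100.0, t_mg2=40.0, w_ice=200.0, w_wh=50.0, rho=0.5, k_fd=4.0)
    print(out)
    print(fuel_rate(out["T_ICE"], 200.0, eta_ice=0.3, h_l=42500.0))
    print(soc_rate(p_b=1950.0, v_oc=200.0, r_b=0.5, q_b=23400.0))
```

Note that $T_{ICE} - T_{ICE,1} = T_{fd} - T_{M/G2}$ in this sketch, which is the part of the ICE torque delivered to the final drive shaft.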

4.2.2.2 Unit Cycles in Spatial Domain

In most existing PHEV power management systems, power decisions are generated in the temporal domain (e.g., power splits between the ICE and the M/G at different time slots in a trip). This is convenient for modeling and performance analysis. However, such decisions are hard to apply to on-road power management because the travel time on each road varies widely. First, the vehicle speed is dynamic and affected by many factors in stochastic ways. Second, at each signalized intersection, it is difficult to retrieve or estimate the waiting time. Power management decisions in the temporal domain can therefore hardly match the real-time driving states. In contrast, as long as the driving route is given, the geographical topology of the traveled roads and the total driving distance do not change. Therefore, PHEV power management in this work is formulated in the spatial domain. Historical driving cycles in the time domain are converted to the spatial domain for modeling. The speed is recalculated as the average value in the corresponding time slots.

A driving cycle consists of roads with different lengths. For modeling convenience, the whole driving cycle is decomposed into unit cycles, each of which represents an urban road segment (arterial or local road) between two intersections or a segment of freeway, as shown in Fig. 4.3. Because urban roads usually have different lengths, five typical lengths, from 0.1 to 0.5 mile with corresponding length indices from 1 to 5, are used to represent urban unit cycles. Unit cycles shorter than 0.1 mile or longer than 0.5 mile are modeled with index 1 or 5, respectively. Affected by intersections and traffic signals, driving speed characteristics on urban roads usually vary with location. To model driving cycles more accurately, an urban unit cycle is divided into three sections: $A_1$ for departure, $A_2$ for cruising, and $A_3$ for arrival. $A_1$ is the distance from the stop line of the upstream intersection to the location where cruising speed is usually achieved. Speeds in $A_1$ have a large probability of transitioning from low to high values. In contrast, $A_3$ represents the deceleration section where PHEVs approach the stop line of the downstream intersection. Lengths of $A_1$, $A_2$, and $A_3$ are chosen according to historical driving cycles. Because freeways are not separated naturally by intersections, the driving distance on a freeway is decomposed into $N$ unit cycles, each of which has a fixed length of 0.2 mile. A freeway unit cycle is one of three types: entering (acceleration, $U_1$), cruising (from $U_2$ to $U_{N-1}$), and exiting (deceleration, $U_N$). For an urban or freeway unit cycle, each 0.1 mile of length is further discretized into 20 slots.
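The unit-cycle bookkeeping described above can be written down compactly. This is a minimal sketch under assumptions: the text fixes the five urban length indices, the clamping to index 1 or 5, the 0.2-mile freeway unit, and 20 slots per 0.1 mile, but the rounding of intermediate urban lengths to the nearest 0.1 mile and all function names are choices made here for illustration.

```python
import math
from typing import List

SLOTS_PER_TENTH_MILE = 20     # from the text: each 0.1 mile is split into 20 slots
FREEWAY_UNIT_SLOTS = 2 * SLOTS_PER_TENTH_MILE  # a 0.2-mile freeway unit cycle


def urban_length_index(length_miles: float) -> int:
    """Map an urban segment length to a length index in 1..5.

    Assumption: lengths are rounded to the nearest 0.1 mile; lengths below
    0.1 mile or above 0.5 mile are clamped to index 1 or 5 as in the text.
    """
    nearest = math.floor(length_miles / 0.1 + 0.5)
    return min(5, max(1, nearest))


def urban_slots(length_index: int) -> int:
    """Number of spatial slots in an urban unit cycle with the given index."""
    if not 1 <= length_index <= 5:
        raise ValueError("length index must be in 1..5")
    return length_index * SLOTS_PER_TENTH_MILE


def freeway_unit_types(n_units: int) -> List[str]:
    """Types of the N freeway unit cycles: U_1 entering, U_2..U_{N-1} cruising, U_N exiting."""
    if n_units < 2:
        raise ValueError("this sketch needs at least an entering and an exiting unit")
    return ["entering"] + ["cruising"] * (n_units - 2) + ["exiting"]


if __name__ == "__main__":
    for miles in (0.04, 0.27, 0.8):
        idx = urban_length_index(miles)
        print(miles, "->", idx, "index,", urban_slots(idx), "slots")
    print(freeway_unit_types(4), FREEWAY_UNIT_SLOTS)
```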

Figure 4.3: Unit cycle models for urban roads and freeway. (a) Urban unit cycle: speed versus distance with sections $A_1$ (departure), $A_2$ (cruising), and $A_3$ (arrival). (b) Freeway unit cycle: speed versus distance with unit cycles $U_1$ (entering), $U_2$, $U_3$, ..., $U_{N-1}$ (cruising), and $U_N$ (exiting).

4.3 Hierarchical Power Management Algorithms and Solutions

4.3.1 Hierarchical Power Management Algorithms

The scheme of the proposed PHEV power management system is shown in Fig. 4.4. The objective is to minimize the total fuel consumption of the entire driving trip. The high-level management allocates battery energy budgets to unit cycles according to real-time vehicle speed prediction. These decisions are made online and updated when new prediction information becomes available. In contrast, low-level power management strategies give the optimal $T_{M/G2}$ and $\omega_{ICE}$ according to real-time driving states. Solving the low-level problem is computationally difficult because of the nonlinear powertrain models and the large set of historical driving cycles. Thus, low-level strategies are generated offline. At both levels, strategies are made and executed through five layers.

Figure 4.4: PHEV hierarchical mode for PHEV power management (columns: High-level Online, Low-level Offline; layers: Data, Model, Optimization, Strategy, Decision; blocks: Real-time Vehicle Speed Prediction, Historical Driving Cycles, Unit Cycles with Spatial Index, Quadratic Model $\dot{fuel}^* = g(P_e)$, PHEV Powertrain Model, Stochastic Quadratic Programming, Markov Decision Process, Battery Energy Budget, MDP Policy, PHEV Real-time State ($V$, $T_{req}$), decisions $T_{M/G2}$, $\omega_{ICE}$)

The first layer is the data input layer. In the online mode, real-time traffic speed prediction from the vehicular network is used for battery energy budget generation. A driver's future average speed on a road is approximated by the road's predicted average traffic speed. It is assumed that traffic speed predictions for the next 30 minutes are available for freeways and urban arterials in the form of expectations and probability distributions, which is reasonable according to recent research results [72, 73]. For other roads without prediction, their historical average speeds and speed transition probabilities are used. In the low-level offline mode, historical driving cycles are the input data for the PHEV powertrain modeling. From the driving cycles, the torque demand is calculated by considering the friction force, aerodynamic force, and acceleration force [74].
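A torque demand of this kind is commonly obtained from a longitudinal road-load balance. The sketch below uses the textbook form (inertial force plus rolling friction plus aerodynamic drag) and is not necessarily the exact model of [74]: the function name, the lossless conversion from wheel torque to final-drive-shaft torque through the ratio K, and every default parameter value are hypothetical.

```python
def torque_demand(
    speed: float,               # vehicle speed v (m/s)
    accel: float,               # vehicle acceleration a (m/s^2)
    mass: float = 1500.0,       # vehicle mass (kg), hypothetical
    c_rr: float = 0.01,         # rolling friction coefficient, hypothetical
    rho_air: float = 1.2,       # air density (kg/m^3)
    c_d: float = 0.3,           # drag coefficient, hypothetical
    area: float = 2.2,          # frontal area (m^2), hypothetical
    wheel_radius: float = 0.3,  # wheel radius (m), hypothetical
    k_fd: float = 4.0,          # final drive ratio K, hypothetical
    g: float = 9.81,            # gravitational acceleration (m/s^2)
) -> float:
    """Torque required on the final drive shaft (N*m) for a given speed and acceleration."""
    f_acc = mass * accel                              # acceleration (inertial) force
    f_roll = c_rr * mass * g                          # friction (rolling resistance) force
    f_aero = 0.5 * rho_air * c_d * area * speed ** 2  # aerodynamic drag force
    wheel_torque = (f_acc + f_roll + f_aero) * wheel_radius
    # Assumption: a lossless final drive, so shaft torque = wheel torque / K,
    # consistent with the shaft turning K times faster than the wheel.
    return wheel_torque / k_fd


if __name__ == "__main__":
    print(f"{torque_demand(speed=20.0, accel=0.5):.3f} N*m")
```

A negative result (strong deceleration) would correspond to braking torque, which in the powertrain model can be recovered by M/G2 through regenerative braking.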

Input data are then applied to models in the second layer. The online mode uses a quadratic model to simplify the nonlinear relationship between the optimal achievable average fuel rate $\dot{fuel}^*$ and the battery budget power $P_e$ (budget energy divided by travel time in the unit cycle) for a unit cycle. It is formulated as $\dot{fuel}^* = a_2 P_e^2 + a_1 P_e + a_0$, where $a_2$, $a_1$, and $a_0$ are coefficients determined through curve fitting. The quadratic model is a trade-off between accuracy and computational complexity [74]. Samples for quadratic model fitting, i.e., pairs of $\dot{fuel}^*$ and $P_e$, are solved offline from historical driving cycles by using DP. Quadratic models are then fitted separately for different types of unit cycles, i.e., with different length indices and average speeds, from the DP solutions in a least-squares sense. The low-level offline models include the PHEV powertrain model and the spatial unit cycle model.
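The least-squares fit of the quadratic fuel-rate model can be sketched without external libraries by solving the 3 × 3 normal equations. This is an illustrative sketch: the function name `fit_quadratic` and the sample points are hypothetical, and real samples would come from the DP solutions described in the text.

```python
from typing import List, Sequence, Tuple


def fit_quadratic(pe: Sequence[float], fuel: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of fuel = a2*pe^2 + a1*pe + a0; returns (a2, a1, a0)."""
    if len(pe) != len(fuel) or len(pe) < 3:
        raise ValueError("need at least three (P_e, fuel) samples of equal length")
    # Power sums for the normal equations, unknowns ordered as (a0, a1, a2).
    s = [sum(x ** k for x in pe) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(pe, fuel)) for k in range(3)]
    m: List[List[float]] = [
        [s[0], s[1], s[2], t[0]],
        [s[1], s[2], s[3], t[1]],
        [s[2], s[3], s[4], t[2]],
    ]
    n = 3
    # Gaussian elimination with partial pivoting on the augmented matrix.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("samples do not determine a unique quadratic")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = m[r][n] - sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = acc / m[r][r]
    a0, a1, a2 = coef
    return a2, a1, a0


if __name__ == "__main__":
    # Hypothetical samples lying exactly on fuel = 2*pe^2 - 3*pe + 1.
    pe = [0.0, 1.0, 2.0, 3.0, 4.0]
    fuel = [2 * x * x - 3 * x + 1 for x in pe]
    print(fit_quadratic(pe, fuel))
```

One such fit would be run per unit-cycle type (length index and average speed level), giving a small table of coefficient triples for the online stage.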

With both data and models, power optimizations are formulated in the third layer. The high-level problem is formulated as a multi-stage stochastic quadratic programming (MSQP) problem to generate battery energy budgets for unit cycles in a trip. With energy budgets as constraints, the low-level problem is formulated as a finite-horizon MDP, and the MDP policies for $T_{M/G2}$ and $\omega_{ICE}$ are generated offline. Finally, during on-road driving, $\omega_{ICE}$ and $T_{M/G2}$ decisions are looked up from the MDP policy tables according to the battery budgets generated in real time, the driving states, and the road length indices.

4.3.2 Optimization Formulation and Solutions

With the two-level power management system, both the high-level online and the low-

level offline power managements are formulated as optimization problems and solved by efficient

algorithms.

4.3.2.1 Online Stochastic Quadratic Programming for Battery Budget Generation

In the high-level online power optimization, a stage is defined as a unit cycle. Future traffic speeds are random variables. Their predictions are available, but full information is not known until they are realized, i.e., when a PHEV enters the next unit cycle and its traffic speed is measured instantaneously. Thus, the optimization should be formulated with probabilistic descriptions of traffic speeds, e.g., probability distributions and densities, to incorporate their effects on optimal budget decisions. As a PHEV is driving, the high-level power management is carried out through sequential decisions in multiple stages from the start to the end of a trip. The diagram of the stochastic optimization is shown in Fig. 4.5. When a new traffic speed is disclosed, its realization should be used to generate energy budgets for future unit cycles. The reaction to the realization of random variables in future decisions is called recourse. Thus, the high-level power management problem can be described as follows: at the end of stage k, given the current battery SOC, the traffic speed measurement of stage k + 1, and the vehicle speed prediction of future stages in the trip, decide the battery energy budget for stage k + 1 in order to minimize the PHEV's fuel consumption in stage k + 1 plus the expected total fuel consumption in the remaining stages. The generated battery energy budget for stage k + 1 is then applied to the low-level power management as a constraint. We observe that the problem has the following features: 1) the problem has a convex polyhedral constraint space; 2) the cost function is quadratic and continuous; 3) traffic speed prediction is assumed to be obtained from a vehicular network. Thus, it is appropriate to model this optimization problem as an MSQP with recourse. Existing mathematical programming techniques for MSQP can solve the problem quickly and to global optimality.

Figure 4.5: Diagram of stochastic programming for online PHEV power management (diagram blocks: Stage k, Traffic Prediction (stochastic process, input), Battery Capacity (constraint), Stochastic Optimization (minimize), Optimal Solutions, Energy Budgets, Stage Update $k \leftarrow k + 1$)

The distribution of the stochastic vehicle speed, obtained from vehicle speed prediction, is represented with a finite number of scenarios. To solve the problem online, the number of scenarios must balance solution quality against computational complexity. Short-term speed prediction, i.e., within the next 10 minutes, usually has a small root mean square error (RMSE), so the number of scenarios in such a stage is set to one by using only the predicted expectation. For long-term prediction, between 10 and 30 minutes ahead, three scenarios are used to represent the two-sigma range of its probability distribution. For the MSQP with stage k as the first stage, the control variable is the battery energy budget $EB_k$. The stochastic average speed is represented as a random variable $V_k$. The MSQP is formulated as:

min zk(EBk, vk) = Tkgvk,il,k(EBk/Tk)

+ EVk+1|vk[Qk(EBk, Vk+1)

] (4.4)

59


whereEBk = (EB1, EB2, ..., EBk)

vk = (v1, v2, ..., vk)(4.5)

$\mathbf{EB}_k$ and $\mathbf{v}_k$ denote the sequence of budget decisions and the realizations of the random variables $V$ up to stage $k$, respectively. $g_{v_k, i_l, k}(EB_k/T_k)$ is the optimal achievable fuel consumption rate in stage $k$ with length index $i_l$, traffic speed $v_k$, and battery power budget $EB_k/T_k$. It is calculated from the quadratic fuel consumption rate models. $T_k = d_k/v_k$ is the travel time in stage $k$ and $d_k$ is the length of the unit cycle. $Q_k(\mathbf{EB}_k, V_{k+1}) = \min z_{k+1}(\mathbf{EB}_{k+1}, \mathbf{v}_{k+1})$ is the recourse function; its conditional expectation given $\mathbf{V}_k = \mathbf{v}_k$ is denoted as $E_{V_{k+1}|\mathbf{v}_k}[Q_k(\mathbf{EB}_k, V_{k+1})]$. The constraints require that, after allocating budget $EB_k$ to stage $k$, the remaining battery SOC stays above the minimum value.

The MSQP described in (4.4) is solved by the quadratic nested decomposition algorithm [75, 76], which evolves from a Newton-type method for solving piecewise quadratic programs. Three assumptions must hold to solve an MSQP with this algorithm: 1) the number of scenarios in each stage is finite; 2) the control variables lie in polyhedral convex sets; 3) the quadratic term of the cost function in each stage is positive semidefinite for all scenarios. Under these assumptions, the algorithm terminates in a finite number of iterations, either returning a globally optimal solution or detecting that the problem is unbounded. Our MSQP formulation (4.4) satisfies all three assumptions and can therefore be solved with this algorithm.
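To make the structure of (4.4) concrete, the following is a minimal two-stage sketch. It is not the quadratic nested decomposition of [75, 76]: it enumerates the first-stage budget on a grid and evaluates the expected recourse cost over a finite scenario set. The fuel-rate coefficients `A`, `B`, `C`, the function names, and all numbers are hypothetical placeholders, not fitted values from this work.

```python
import numpy as np

# Hypothetical convex quadratic fuel-rate model g(p) = A p^2 + B p + C (g/s),
# where p is the battery power budget in kW. Coefficients are illustrative only.
A, B, C = 0.002, -0.08, 1.0
P_BEST = -B / (2.0 * A)  # battery power at which the fuel rate is lowest (20 kW)


def stage_fuel(budget_kwh, dist_mile, speed_mph):
    """Fuel (g) in one stage: T_k * g(EB_k / T_k), with T_k = d_k / v_k."""
    t_hours = dist_mile / speed_mph
    power_kw = budget_kwh / t_hours
    return t_hours * 3600.0 * (A * power_kw**2 + B * power_kw + C)


def recourse_cost(remaining_kwh, dist_mile, speed_mph):
    """Second-stage cost: spend the remaining budget, but never beyond P_BEST."""
    t_hours = dist_mile / speed_mph
    eb2 = min(remaining_kwh, t_hours * P_BEST)
    return stage_fuel(eb2, dist_mile, speed_mph)


def solve_two_stage(e_avail, d1, v1, d2, scenarios, n_grid=201):
    """Grid search over the first-stage budget EB_1.

    scenarios: list of (second-stage speed in MPH, probability).
    Returns (best EB_1 in kWh, expected total fuel in g).
    """
    best_eb1, best_cost = 0.0, float("inf")
    for eb1 in np.linspace(0.0, e_avail, n_grid):
        expected = sum(p * recourse_cost(e_avail - eb1, d2, v2) for v2, p in scenarios)
        cost = stage_fuel(eb1, d1, v1) + expected
        if cost < best_cost:
            best_eb1, best_cost = float(eb1), cost
    return best_eb1, best_cost


# Three hypothetical long-term speed scenarios for the second stage.
eb1, fuel = solve_two_stage(0.2, 1.0, 30.0, 1.0, [(20.0, 0.25), (30.0, 0.5), (40.0, 0.25)])
```

Grid search is used here only because it is easy to verify by hand; it scales poorly with the number of stages, which is one reason a decomposition method is preferable for the full multistage problem.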

4.3.2.2 Offline PHEV Power Policy Generation

In the low-level offline power management, PHEV power policies are solved for unit cycles by using historical driving data. PHEV power policies map driving states to optimal $\omega_{ICE}$ and $T_{M/G2}$ decisions. These decisions are sensitive to the length of the unit cycle, the vehicle's average speed, and the battery budget allocation. Thus, policies are differentiated for unit cycles with different length indices (from 1 to 5 for urban unit cycles) and speed levels (low or high, with average-speed thresholds of 25 MPH and 50 MPH for urban and freeway unit cycles, respectively). Because a PHEV's speed profile in a unit cycle differs from trip to trip, the power policies minimize the expected fuel consumption in the unit cycle.


The driving speed and torque demand of a PHEV in one time slot are mainly determined by the driver's pedal/throttle command and the vehicle speed in the previous time slot. Thus, both the speed and the torque requirement are modeled as Markov chains. The problem of solving the optimal low-level power policies for unit cycles is modeled as a finite-horizon MDP: given the battery energy budget constraint and the vehicle speed level, find the M/G2 torque and ICE speed policy that minimizes the total expected fuel consumption in the unit cycle. For simplicity, the ICE's on/off state is not optimized in the MDP. Instead, it is controlled by rule-based strategies, i.e., the ICE is turned off after the car has been waiting for a period. An MDP stage is a road slot, reflecting the spatial granularity of the model. An MDP state $x_k$ in stage $k$ includes the PHEV's speed $v_k$, the torque demand on the wheel $T_{w,k}$, and the remaining energy budget percentage $q_k$. An action $a_k$ comprises M/G2's torque output $T_{M/G2,k}$ and the ICE's speed $\omega_{ICE,k}$. State transition probabilities $p(x, x') = \Pr(x_{k+1} = x' \mid x_k = x)$ are differentiated for the low and high speed levels and learned by maximum likelihood estimation [77]. The reward $r_k(x_k, a_k)$ for action $a_k$ in stage $k$ is designed as the negative fuel consumption. The MDP maximizes the total expected reward under the policy $\pi$:

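$$E_\pi\left\{\sum_{k=1}^{N} r_k(x_k, a_k)\right\} \tag{4.6}$$

where

$$\begin{aligned} x_k &= (v_k, T_{w,k}, q_k) \\ a_k &= (T_{M/G2,k}, \omega_{ICE,k}) \\ r_k(x_k, a_k) &= \frac{\omega_{ICE,k} L_s}{\eta_{ICE} H_l v_k}\,(1+\rho)\,(T_{fd,k} - T_{M/G2,k}) \end{aligned} \tag{4.7}$$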

$L_s$ is the length of a stage. The finite-horizon MDP is usually solved with the backward induction method [78]. In the last stage $N$, the reward $r_N(x, a)$ is maximized. Then, from stage $N-1$ back to the first stage, the value function $V_k(x)$ in each stage $k$ is computed with the optimal action $a$ as:

$$V_k(x) = \max_{a \in A}\Big\{ r_k(x, a) + \sum_{x' \in X} p(x, x')\, V_{k+1}(x') \Big\} \tag{4.8}$$

where $X$ and $A$ are the state and action spaces, respectively. $\pi_{k, i_l, j, EB}(x)$ is the power management policy generated for stage $k$ in the unit cycle with length index $i_l$, speed level $j$ (0 or 1 for high or low level), and battery budget $EB$. According to the vehicle's driving state $x_k$, power decisions are looked up from policy tables as $a_k = \pi_{k, i_l, j, EB}(x_k)$.
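The recursion in (4.8) can be sketched in a few lines. The example below is a generic tabular backward induction, assuming small discrete state and action spaces and, for generality, action-dependent transition probabilities; the array layout and the function name `backward_induction` are illustrative assumptions, not the implementation used in this work.

```python
import numpy as np


def backward_induction(rewards, transitions):
    """Tabular backward induction for a finite-horizon MDP.

    rewards:     array (N, X, A), rewards[k, x, a] = r_k(x, a)
    transitions: array (A, X, X), transitions[a, x, y] = p(y | x, a)
    Returns (value, policy), both of shape (N, X).
    """
    n_stages, n_states, _ = rewards.shape
    value = np.zeros((n_stages + 1, n_states))  # terminal value V_{N+1} = 0
    policy = np.zeros((n_stages, n_states), dtype=int)
    for k in range(n_stages - 1, -1, -1):
        # q[x, a] = r_k(x, a) + sum_y p(y | x, a) * V_{k+1}(y)
        q = rewards[k] + np.einsum("axy,y->xa", transitions, value[k + 1])
        policy[k] = q.argmax(axis=1)
        value[k] = q.max(axis=1)
    return value[:-1], policy


# Tiny hypothetical example: 2 stages, 2 states, 2 actions, states never change.
r = np.full((2, 2, 2), -1.0)
r[:, :, 1] = -2.0       # action 1 burns more fuel in state 0 ...
r[:, 1, 1] = -0.5       # ... but less in state 1
p = np.stack([np.eye(2), np.eye(2)])
value, policy = backward_induction(r, p)
```

The returned `policy[k]` plays the role of a policy table for stage `k`: indexing it with the current state gives the action to apply.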



Table 4.1: Configuration of PHEV Powertrain with PSD

Component   Parameter       Value
Vehicle     Total Mass      1486 kg
ICE         Maximum Power   43 kW @ 4000 rpm
            Peak Torque     101.7 N*m @ 4000 rpm
            Idling Speed    1200 rpm
M/G1        Maximum Power   15 kW
            Peak Torque     55 N*m @ -2500 rpm∼2500 rpm
            Max Speed       ±6000 rpm
M/G2        Maximum Power   30 kW
            Peak Torque     305.0 N*m @ 0∼940 rpm
            Max Speed       6000 rpm
Battery     Capacity        6 Ah × 308 V

4.4 Results and Analysis

The simulation platform is built with Java, Matlab, and the ADVISOR simulator [79], a common vehicle simulation package; the power management algorithms are implemented in Java and Matlab. The Toyota Prius powertrain parameters are obtained from ADVISOR and shown in Table 4.1. The proposed power management method is built and validated on eight standard driving cycles in ADVISOR. Seven of them (HWFET, INRETS, LA92, NYCC, SC03, SC06, and UNIF01) are used for learning the speed transition probabilities and fitting the quadratic models. The remaining UDDS cycle is used for evaluating the power management performance. To simulate on-road driving, stochastic driving cycles are generated based on UDDS and the speed transition probabilities, and adjusted according to the future vehicle speed prediction. The vehicle speed prediction is presented in the form of a speed probability density function (pdf). The pdf is assumed to be normal [80, 81], with the prediction result and its RMSE used as estimates of the distribution mean and standard deviation, respectively. The vehicle speed scenario probabilities required in the high-level MSQP are calculated from the vehicle speed prediction models.
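One possible way to turn a normal speed pdf into a small scenario set is sketched here. The text does not specify how the scenario values and probabilities are placed, so the choice made in this sketch (equal-width bins over the two-sigma range, bin centers as scenario speeds, renormalized bin masses as probabilities) is an assumption, and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def speed_scenarios(mean_mph, rmse_mph, n_scenarios=3, n_sigma=2.0):
    """Discretize N(mean, rmse^2) over [mean - n_sigma*rmse, mean + n_sigma*rmse].

    Returns a list of (scenario speed in MPH, probability) pairs whose
    probabilities sum to one.
    """
    width = 2.0 * n_sigma / n_scenarios  # bin width in units of sigma
    edges = [-n_sigma + i * width for i in range(n_scenarios + 1)]
    bins = list(zip(edges[:-1], edges[1:]))
    masses = [normal_cdf(hi) - normal_cdf(lo) for lo, hi in bins]
    total = sum(masses)  # mass inside the two-sigma range, used to renormalize
    return [
        (mean_mph + 0.5 * (lo + hi) * rmse_mph, mass / total)
        for (lo, hi), mass in zip(bins, masses)
    ]


# Hypothetical long-term prediction: mean 30 MPH with RMSE 5 MPH.
scenarios = speed_scenarios(30.0, 5.0)
```

With a single scenario (`n_scenarios=1`) the function returns the predicted mean with probability one, which matches the short-term case where only the expectation is used.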

For system evaluation, we analyze the system modeling results, the performance of power decisions in a randomly selected driving cycle, and the fuel consumption in different test cases. First, the low-level models learned from the seven historical driving cycles are checked. Speed transition probabilities of urban road driving in different sections are shown in Fig. 4.7. Each grid cell represents a speed transition instance from the current stage to the next stage in the spatial domain. Speed transition characteristics differ across the three sections. In the departure and arrival sections, most transition instances lie above and below the diagonal, respectively. In the cruising section, speeds remain stable with high probability and transition instances lie around the diagonal. The UDDS driving cycle and a sample stochastic driving cycle in the spatial domain are shown in Fig. 4.6a and Fig. 4.6b, respectively.

Fig. 4.8 shows examples of M/G2 torque and ICE speed policies generated from the MDP. $V_{veh}$ and $T_{fd}$ indicate the vehicle's speed and the torque requirement on the final drive shaft, respectively. $\alpha_{M/G2}$ is the ratio of the M/G2 torque output to $T_{fd}$. $\omega_{ICE}$ is the ICE speed normalized to the maximum ICE speed. Due to the limited number of driving cycles for training, some states are not covered, and their $\alpha_{M/G2}$ and $\omega_{ICE}$ entries are assigned negative values. For these states, policies are approximated by those of adjacent neighbors. If the neighbors' policies are also unavailable, the charge-depleting/charge-sustaining (CDCS) strategy is applied. In the CDCS strategy, M/G2 generates the torque to meet the PHEV demand while the battery SOC is sufficient (charge depleting). Once the battery SOC drops to a low level, both the ICE and M/G2 provide torque and the battery SOC is maintained within a preferred range (charge sustaining).

Taking the M/G2 torque policy in Fig. 4.8a as an example, $\alpha_{M/G2}$ is large in regions with low torque demand and high driving speed because M/G2 is highly efficient there. When the driving speed is low, a larger part of the torque demand is generated by the ICE: because M/G2's rotary speed follows the driving speed and its efficiency decreases as its rotary speed decreases, battery energy is saved for later use when M/G2's efficiency is high. Similarly, the ICE speed policy in Fig. 4.8b gives the $\omega_{ICE}$ decisions for maximum fuel efficiency.

Figure 4.6: UDDS driving cycle and a sample of generated stochastic driving cycle in spatial domain. Panels: (a) UDDS; (b) Stochastic driving cycle. Axes: Driving Distance (Mile) vs. Speed (MPH).

Figure 4.7: Speed transition probabilities for urban roads with light traffic. Panels: (a) Transition probabilities for vehicle departure; (b) Transition probabilities for vehicle cruising; (c) Transition probabilities for vehicle arrival.


For the low-level power management evaluation, the expected fuel consumption in a unit cycle under the MDP policy, given by $-V_0(x)$ as defined in (4.8), is compared with that under the CDCS strategy. Results are shown in Fig. 4.9. When the battery energy budget is small, the MDP policy is more fuel-efficient than CDCS. Once the battery energy budget is large enough to sustain M/G2's torque generation over the whole unit cycle, CDCS performs the same as the MDP policy.

Vehicle speed prediction accuracy and the battery energy available for discharging are two important factors affecting power management decisions in the proposed system. To evaluate the performance of the proposed two-level power management system, five cases are tested with different configurations of vehicle speed prediction and initial battery SOC. Three cases have the same initial battery SOC of 0.7 but different vehicle speed prediction errors, with RMSE of 3 MPH, 5 MPH, and 8 MPH, respectively. The other two cases assume a vehicle speed prediction RMSE of 5 MPH and start with an initial battery SOC of 0.9 and 0.5, respectively. Ten stochastic driving cycles are tested in each case. The performance of our proposed method, denoted as MSQP/MDP, is compared with four other methods. The second method, denoted as QP/MDP, uses only the expectation of the vehicle speed prediction in the high-level problem and ignores its distribution information, which simplifies the high-level optimization to a quadratic program (QP). In the third method, called MDP Only, the battery budget generation is not optimized. Instead, the low-level MDP model is given as large a battery budget as the battery SOC allows. The fourth method is CDCS. The last method, denoted as Static, solves power management decisions offline with DP based on the UDDS cycle and then applies these decisions to the testing driving cycles. Even though UDDS and the testing cycles follow the same route, static decisions cannot always be applied to the testing cycles because their driving states may differ. For example, at the same location, the PHEV may decelerate without torque output in UDDS but require torque generation for acceleration in a testing cycle. In these situations, the CDCS method is applied instead.

The high-level online MSQP in the proposed MSQP/MDP method can be solved within 3 seconds. A test sample is selected randomly and its torque generation and SOC profile are studied in detail. The torque requirement on the final drive shaft is shown in Fig. 4.10. The torque outputs of the ICE and M/G2 and the fuel consumption along the driving distance with MSQP/MDP and CDCS are shown in Fig. 4.11. With the CDCS strategy, less fuel is consumed at the beginning of the driving cycle and more torque is generated by M/G2. However, the battery is depleted quickly, so fuel must be consumed for torque generation in the rest of the driving cycle even where the fuel efficiency is low. The total fuel consumption of CDCS is therefore larger than that of MSQP/MDP. In our proposed method, the torque outputs of the ICE and M/G2 are well balanced to minimize the total fuel consumption in the trip. The battery SOC profiles in the test sample with the five methods are shown in Fig. 4.12. Unlike the CDCS method, the proposed MSQP/MDP schedules the battery energy well along the driving cycle.

Figure 4.8: MDP policy maps for an urban unit cycle with length index $i_l = 2$, stage index $k = 15$, light traffic, and remaining battery budget 0.015 kWh. Panels: (a) M/G2 torque policy map; (b) ICE speed policy map.

Figure 4.9: Expected fuel consumption of an urban unit cycle (length index $i_l = 2$) with MDP and CDCS. Axes: Battery Energy Budget (kWh) vs. Expected Total Fuel Consumption (gram).

We further check the ICE operation points on the ICE efficiency map, shown in Fig. 4.13, where different fuel efficiency levels, e.g., from 0.15 to 0.4, appear as contours. The ICE optimal operation line is defined as the set of operation points that consume the least fuel for a given constant power output [82]. As shown in Fig. 4.13, the operation points of the proposed MSQP/MDP are close to the optimal operation line, which means that high fuel efficiency is achieved with less fuel consumption. In contrast, the operation points of CDCS are widely scattered over the ICE fuel efficiency map. This is because the battery is depleted in early stages with the CDCS method, and the ICE then has to consume fuel and work in regions with low fuel efficiency in order to generate enough torque.

Average fuel consumption is further compared among the five methods in the five cases, and the results are shown in Fig. 4.14. In each case, the average fuel consumption of the 10 tests is calculated and normalized to that of MSQP/MDP. The proposed MSQP/MDP outperforms the other methods in all cases in terms of fuel consumption, while the CDCS method performs worst. Fig. 4.14a shows that the difference in fuel consumption between MSQP/MDP and QP/MDP grows as the prediction RMSE increases. This is because, when the vehicle speed prediction is inaccurate (larger RMSE), using only the prediction expectation in the high-level budget generation is not enough to reach optimal decisions. On the other hand, with less accurate prediction, i.e., RMSE = 8 MPH, the fuel consumption difference between MSQP/MDP and the Static method is small. This is because inaccurate prediction in the high-level management cannot generate optimal energy budgets and does not improve the system performance significantly. Fig. 4.14b shows that the differences in fuel consumption among MSQP/MDP, Static, and CDCS become smaller as the initial battery SOC increases. This indicates that the management contributes little to fuel reduction when battery energy is sufficient and CDCS is near optimal. As the initial SOC increases, the performance of QP/MDP degrades gradually and is outperformed by MDP Only. This shows that the sub-optimality of high-level decisions can be amplified by a large amount of available battery energy: the QP/MDP method fails to schedule battery budgets optimally and leaves much battery energy unused at the end of the trip.

Figure 4.10: Required torque on the final drive shaft. Vertical axis: Torque (N*m).

Figure 4.11: ICE and EM torque output and fuel consumption along distance in MSQP/MDP and CDCS. Panels: (a) ICE torque output (MSQP/MDP); (b) ICE torque output (CDCS); (c) EM torque output (MSQP/MDP); (d) EM torque output (CDCS); (e) Fuel consumption (MSQP/MDP); (f) Fuel consumption (CDCS).

Figure 4.12: Battery SOC along driving distance in a sample test driving cycle. Legend: MSQP/MDP, QP/MDP, Static, CDCS, MDP Only.

Figure 4.13: Operation points on ICE efficiency map. Legend: OPL; × 2-level MSQP/MDP OP; ○ CDCS OP.

Figure 4.14: Fuel consumption comparison. Panels: (a) Fuel consumption comparison with initial SOC 0.7 and different vehicle speed prediction RMSE; (b) Fuel consumption comparison with vehicle speed prediction RMSE 5 MPH and different initial battery SOC.

Chapter 5

Traffic and Vehicle Speed Prediction in Vehicular Networks


Traffic prediction covers the prediction of traffic flow, the average traffic speed of selected roads, and the average travel time on selected routes. Prediction matters most for freeways and arterials, whose flow and speed vary widely within a day and at the same time on different days. Traffic prediction methods can be categorized into two types: model-based prediction and data-driven prediction. Model-based prediction uses traffic models, such as vehicular density, vehicular flow, and individual vehicle trajectories [83, 84, 85, 86, 87], to describe future traffic conditions. These methods require complex computation and extensive on-site calibration and are difficult to implement. Data-driven methods, on the other hand, rely heavily on the input traffic data and try to find the relationship between future and historical data. They are easy to implement and can adapt to changing traffic conditions. Existing data-driven traffic prediction models are based on historical average, time series analysis, NN, and nonparametric regression.

In [88], the authors compare the above four models in freeway traffic flow prediction. Historical average is the simplest model to implement but has the largest average absolute error. The autoregressive integrated moving average (ARIMA) model, a widely used time series model for prediction, can be easily implemented with existing techniques, but it handles missing values poorly and is only slightly better than the historical average model. NN has the second-best results and is suitable for predicting nonlinear relationships, but it requires a complex training process. Nonparametric regression with the nearest neighbor formulation performs best, with the smallest average absolute error and a well-distributed error. However, it relies heavily on the quality of the database, and recognizing neighbors is also complex. Work [89] combines ARIMA with Generalized Autoregressive Conditional Heteroscedasticity (ARIMA-GARCH) to deal with non-constant conditional variance in one-step (5-15 minute) freeway traffic flow prediction. Results show that ARIMA-GARCH provides additional information, i.e., the time-variant confidence interval, and reaches prediction accuracy similar to ARIMA. Work in [90] proposes NN with road clustering for traffic speed prediction. The NN training time is reduced by utilizing the correlation within clusters. Results show that the proposed method gives more accurate prediction than time-series methods and the binary NN method proposed in [91]. Different prediction intervals (from 5 minutes to 30 minutes) and two traffic conditions (congestion and non-congestion) are explored in [92] with NN.

For individual vehicle speed prediction, work in [93] uses a constant percentile to predict the vehicle speed, which assumes that drivers have preferred speeds at different locations and tend to keep them on each drive. This method suffers from low accuracy since the influence of traffic conditions on vehicle speed is not considered. Work in [94] proposes a vehicle cruising speed prediction method based on non-parametric kernel density estimation (KDE) and parameterized launching models. The prediction system is designed with low complexity, but it can only predict the vehicle's speed 20 seconds ahead and is limited to specific road types.

5.2 System Description

It is assumed that all vehicles studied in this system are connected through on-board smartphone communications in a vehicular network. Their driving data are measured in real time by smartphone embedded sensors and stored in memory cards. Driving data are uploaded to the cloud regularly and aggregated there to calculate real-time traffic speed and flow information. Because traffic conditions significantly affect vehicle speed, the accuracy of vehicle speed prediction can be greatly improved by utilizing future traffic conditions. Thus, the vehicle speed prediction system is designed as a two-level scheme, shown in Fig. 5.1. The first-level system predicts the traffic speed down the road with NN models running remotely on cloud servers. Road segments are targeted for prediction according to vehicles' driving routes. NNs are effective at representing complex nonlinear relationships between different statistics, e.g., the traffic speed relationship between a target road segment and its neighbors.

Besides traffic speed, individual vehicle speeds are also affected by other factors, including vehicle type, road type, and lane selection/change. Some driving states are unobservable, e.g., the vehicle's lane selection/change, which is determined by the driver's preference or by the reaction to real-time traffic events. Vehicle lane detection relies on additional devices and precise localization techniques [95], which are not available in common vehicles. To deal with the unobservable states, an HMM is selected as a suitable model to establish the relationship between vehicle speeds and different states and to predict the vehicle speed. An HMM is a stochastic Markov model where the observation is a probabilistic function of the hidden states [96]. Although hidden states are not directly observable, they have physical meanings in the vehicle speed prediction application and can be deduced from the observation sequence. A hidden state represents the joint distribution of the traffic speed and vehicle speed on a road segment through an emission function. Because different types of vehicles, e.g., sedans and SUVs, have different mobility characteristics, a separate HMM is built for each vehicle type for accurate modeling and prediction. When a vehicle enters road segment $k$, the vehicle speed data on all previous $k-1$ road segments, together with the traffic speed prediction for all remaining road segments, are used by the HMM to predict the vehicle speed on the remaining road segments.

Figure 5.1: Scheme of the 2-level vehicle speed prediction system

(Block labels: Cloud, Data Upload, Historical Data, Real-time Data, Training, Prediction, NN, Traffic Prediction, Historical Cycle, Historical Traffic, Driving Route, HMM, Vehicle Speed Prediction, Result.)

5.3 Vehicle Speed Prediction System Design

In this section, we will elaborate on the two-level prediction system design based on NN

and HMM.



5.3.1 Traffic Speed Prediction with NN

Traffic speeds of a target road segment in future time horizons are predicted based on the

current and historical data of itself and its neighbor road segments. In our work, the traffic speed

prediction starts from 7AM for the morning rush hours with the time resolution (prediction period)

∆t. At time t, traffic speeds of the road segment i in future n periods (n-period ahead) are to be

predicted by the nonlinear function gi(·):

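$$v_i(t + n\Delta t) = g_i\Big(\{v_i(t), \ldots, v_i(t - m_i\Delta t)\},\ \{f_i(t), \ldots, f_i(t - m'_i\Delta t)\},\ \{\mathbf{v}_{nb,i}(t), \ldots, \mathbf{v}_{nb,i}(t - m_{nb,i}\Delta t)\},\ \{\mathbf{f}_{nb,i}(t), \ldots, \mathbf{f}_{nb,i}(t - m'_{nb,i}\Delta t)\}\Big) \tag{5.1}$$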

The prediction input data can be categorized into four groups. The first group $\{v_i(t), \ldots, v_i(t - m_i\Delta t)\}$ is the target road's historical traffic speed, where $m_i$ is the length of previous data utilized. Similarly, the second group $\{f_i(t), \ldots, f_i(t - m'_i\Delta t)\}$ is the historical flow of the target road. Here the flow of road segment $i$ in time period $t$ is defined as the total number of vehicles entering the road segment within that period (vehicle count/$\Delta t$). The third and fourth groups are historical data of the target road's neighbor roads, where $\mathbf{v}_{nb}$ and $\mathbf{f}_{nb}$ denote the vectors of traffic speed and flow, respectively. The size of each data group is constrained to reduce the training complexity. Under this size constraint, the neighbor roads and the length of historical data are selected according to the correlation of their traffic data with the target road. The interdependence relation $g_i(\cdot)$ is learned by an NN.

An NN is a statistical model with neurons and sets of adaptive weights that approximates non-linear functions of its input. An NN can be described as a weighted directed graph where artificial neurons are nodes and weighted directed edges are connections between neuron outputs and neuron inputs [97]. A neuron takes inputs $x_i$ with individual weights $w_i$ plus a bias $b$ and applies an activation function $f$ to generate the output $y$, as shown in Fig. 5.2. An activation function can be a step, piecewise linear, sigmoid, or Gaussian function. NNs can be categorized into feed-forward networks and feedback networks. In feed-forward networks, there are no loops in the NN graph. In contrast, loops and feedback connections occur in feedback networks. The NN model selected is a nonlinear autoregressive network with exogenous inputs (NARX) and feedback connections, as shown in Fig. 5.3, because the inputs include dependent signals relating to the target road $i$ and exogenous (independent) signals of road $i$'s neighbors. The NN model is designed with one hidden layer of 30 log-sigmoid neurons. After the NN model is trained with historical traffic data, it takes $\mathbf{v}_{nb}$, $\mathbf{f}_{nb}$, and $v_i$ as input to predict road $i$'s future traffic speed. When a vehicle is driving along the route, its travel time to the following roads is first estimated according to the vehicle's current speed. Then the traffic speed prediction result for each road segment at the vehicle's estimated arrival time is fetched accordingly.
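As an illustration of how the four input groups of (5.1) feed a single-hidden-layer network, the sketch below stacks lagged speed and flow samples into one input vector and evaluates one forward pass through 30 log-sigmoid neurons. It covers only the forward evaluation, not NARX training or the feedback connections. The use of one common lag length `m`, the linear output neuron, and all function names are simplifying assumptions.

```python
import numpy as np


def log_sigmoid(z):
    """Log-sigmoid activation, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def lagged_inputs(v_i, f_i, v_nb, f_nb, t, m):
    """Stack the four input groups of (5.1) at time index t, using m lags each.

    v_i, f_i:   arrays (n_time,) for the target road
    v_nb, f_nb: arrays (n_neighbors, n_time) for the neighbor roads
    """
    lags = slice(t - m, t + 1)
    return np.concatenate(
        [v_i[lags], f_i[lags], v_nb[:, lags].ravel(), f_nb[:, lags].ravel()]
    )


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer of log-sigmoid neurons followed by a linear output."""
    hidden = log_sigmoid(w_hidden @ x + b_hidden)
    return float(w_out @ hidden + b_out)


# Hypothetical data: 10 time periods, 2 neighbor roads, 2 lags, time index 5.
rng = np.random.default_rng(0)
v_i, f_i = rng.uniform(40, 70, 10), rng.uniform(5, 30, 10)
v_nb, f_nb = rng.uniform(40, 70, (2, 10)), rng.uniform(5, 30, (2, 10))
x = lagged_inputs(v_i, f_i, v_nb, f_nb, t=5, m=2)

# Untrained random weights for a hidden layer with 30 neurons.
w_hidden, b_hidden = rng.normal(size=(30, x.size)), rng.normal(size=30)
w_out, b_out = rng.normal(size=30), 0.0
prediction = forward(x, w_hidden, b_hidden, w_out, b_out)
```

In practice the inputs would be normalized and the weights fitted on historical data; the random weights here only demonstrate the shapes involved.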

Figure 5.2: Diagram of a NN neuron. Labels: inputs $x_1, \ldots, x_5$; weights $w_1, \ldots, w_5$; bias $b$; summation $\Sigma$; activation $f$; output $y$.

Figure 5.3: NARX NN Model for Traffic Speed Prediction. Labels: Input Layer (Current and Historical Traffic Flow and Speed); Hidden Layer (30 Neurons); Output Layer (Future Traffic Speed Prediction).


5.3.2 Vehicle Speed Prediction with HMM

5.3.2.1 HMM Design and Training

The HMM for vehicle speed prediction is designed with a left-to-right structure, as shown in Fig. 5.4, to represent driving from the start point to the destination. An HMM is called left-to-right if and only if the hidden state transitions do not include loops. In the HMM, a stage represents a road segment. Each circle in Fig. 5.4 represents a hidden state. We denote $q_k$ as the hidden state in stage $k$, which belongs to the state set $\{S_i : N_{k-1}+1 \le i \le N_k\}$. Hidden states are traversed stage by stage as the vehicle drives. The observation of the HMM in stage $k$ is denoted as $o_k$. An HMM is fully described by three parameters $\lambda = (A, B, \pi)$, where $A$ is the state transition distribution, $B$ is the emission probability distribution, and $\pi$ is the initial state distribution. Each stage observation is a realization of the bivariate random variable $O_k = (V_{tf,k}, V_{vh,k})$, where $V_{tf,k}$ and $V_{vh,k}$ represent the traffic speed and vehicle speed in stage $k$, respectively. After a vehicle passes stage $k$, $o_k$ is recorded. The state transition distribution matrix is defined as $A = \{a_{i,j}\}$, where

$$a_{i,j} = P[q_{k+1} = S_j \mid q_k = S_i], \qquad N_{k-1}+1 \le i \le N_k, \quad N_k+1 \le j \le N_{k+1} \tag{5.2}$$

$a_{i,j}$ is the probability of transferring from $S_i$ in stage $k$ to $S_j$ in stage $k+1$. Since the speeds in each observation are continuous values, $B = \{b_i(o_k)\}$ is defined as the conditional joint probability density of the observation random variables given hidden state $S_i$. Gaussian mixture models [96, 98] are used to construct $b_i(o_k)$ as:

$$b_i(o_k) = \sum_{m=1}^{M} c_{i,m}\, G(o_k, \boldsymbol{\mu}_{i,m}, \boldsymbol{\Sigma}_{i,m}) \tag{5.3}$$

where $M$ is the number of Gaussian mixture components, $c_{i,m}$ is the mixture coefficient for the $m$th mixture in state $i$, and $G(o_k, \boldsymbol{\mu}_{i,m}, \boldsymbol{\Sigma}_{i,m})$ is a bivariate Gaussian density with mean vector $\boldsymbol{\mu}_{i,m}$ and covariance matrix $\boldsymbol{\Sigma}_{i,m}$. $\pi = \{\pi_i, 1 \le i \le N_1\}$ defines the probability of the initial hidden state in the first stage.
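A direct evaluation of the emission density in (5.3) for one hidden state can be sketched as follows. The function names and the example numbers are illustrative assumptions; the observation is ordered as (traffic speed, vehicle speed) to match $O_k$.

```python
import numpy as np


def gaussian_pdf(o, mean, cov):
    """Multivariate Gaussian density G(o, mean, cov)."""
    d = np.asarray(o, dtype=float) - np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    norm = np.sqrt((2.0 * np.pi) ** d.size * np.linalg.det(cov))
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm)


def emission_density(o, coeffs, means, covs):
    """b_i(o_k) = sum_m c_{i,m} * G(o_k, mu_{i,m}, Sigma_{i,m}) for one state i."""
    return sum(c * gaussian_pdf(o, mu, cov) for c, mu, cov in zip(coeffs, means, covs))


# Hypothetical hidden state with M = 2 mixture components, speeds in MPH.
coeffs = [0.4, 0.6]
means = [[50.0, 48.0], [60.0, 63.0]]
covs = [[[16.0, 8.0], [8.0, 25.0]], [[9.0, 4.0], [4.0, 16.0]]]
b = emission_density([55.0, 54.0], coeffs, means, covs)
```

The mixture coefficients must be non-negative and sum to one for `emission_density` to be a valid probability density.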

The HMM must be trained before prediction, i.e., the number of hidden states in each stage, $A$, $B$, and $\pi$ must be determined. The optimal number of hidden states can be selected according to Akaike's information criterion (AIC) or the Bayesian information criterion (BIC) on a maximum likelihood basis [99, 100], which are defined as:

$$AIC = -2\log L + 2p, \qquad BIC = -2\log L + p\log T \tag{5.4}$$

Figure 5.4: Left-to-right HMM for vehicle speed prediction. Labels: driving route over Road 1, Road 2, Road 3, ..., Road K; hidden states $S_1, \ldots, S_{N_1}$ in stage 1, $S_{N_1+1}, \ldots, S_{N_2}$ in stage 2, and so on up to $S_{N_{K-1}+1}, \ldots, S_{N_K}$ in stage K; emission densities $b_i(o_k)$; transition probabilities $a_{i,j}$.

where $\log L$ is the log-likelihood of the fitted model, $p$ is the number of model parameters, which accounts for the initial probabilities $\pi_i$, the transition probabilities $a_{i,j}$, and $c_{j,m}$, $\boldsymbol{\mu}_{j,m}$, and $\boldsymbol{\Sigma}_{j,m}$ in (5.3), and $T$ is the number of samples used for training. The lower the AIC and BIC values, the better the model fits. AIC and BIC are both penalized likelihood criteria. Their main difference is that BIC has a larger penalty term than AIC whenever $\log T > 2$, i.e., for all but very small sample sizes. As a result, AIC risks selecting too large a model and BIC too small a model. AIC works better against underfitting with a small sample set, while BIC is preferred with a large sample set to prevent overfitting [101]. In this work, both AIC and BIC are used and their results are compared. HMM configurations with the smallest AIC or BIC are selected for the vehicle speed prediction. An HMM is trained with historical traffic and vehicle data by the Baum-Welch algorithm [102]. Since the number of stages is large, every stage is assumed to have the same number of hidden states $Q$ to reduce the complexity of model selection. The number of Gaussian mixture components is also set the same for every hidden state. The HMM training aims to find the optimal $(Q, M)$ configuration and HMM parameters. The training procedure includes three steps. First, $(Q, M)$ configurations are initialized with different values. Second, for each $(Q, M)$ configuration, the related HMM is trained by the Baum-Welch algorithm and the AIC and BIC values are calculated. Finally, the two configurations with the smallest AIC and BIC values, respectively, are selected and their HMMs are used for vehicle speed prediction and evaluation.
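The selection step of this procedure can be sketched as below, assuming the Baum-Welch training has already produced a log-likelihood and a parameter count for each candidate $(Q, M)$ configuration. The log-likelihoods, parameter counts, and sample size in the example are invented for illustration and are not results from this work.

```python
import math


def aic(log_l, p):
    """Akaike's information criterion from (5.4)."""
    return -2.0 * log_l + 2.0 * p


def bic(log_l, p, t):
    """Bayesian information criterion from (5.4) with t training samples."""
    return -2.0 * log_l + p * math.log(t)


def select_configurations(fits, t):
    """Pick the (Q, M) keys with the smallest AIC and the smallest BIC.

    fits: dict mapping (Q, M) -> (log-likelihood, number of parameters)
    """
    best_aic = min(fits, key=lambda qm: aic(*fits[qm]))
    best_bic = min(fits, key=lambda qm: bic(*fits[qm], t))
    return best_aic, best_bic


# Hypothetical training outcomes for two candidate configurations.
fits = {(2, 1): (-1000.0, 20), (3, 2): (-960.0, 50)}
best_aic, best_bic = select_configurations(fits, t=500)
```

In this made-up example the richer model wins under AIC while the smaller one wins under BIC, which illustrates why both selections are carried forward for comparison.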



5.3.2.2 Prediction Algorithm

When a vehicle is in stage $k^*$, the observations in all previous $k^*-1$ stages are denoted as the sequence $\mathbf{o}_{k^*-1} = (o_1, o_2, \ldots, o_{k^*-1})$. The vehicle speed is predicted as the conditional expectation of the vehicle speed in stage $k$ ($k \ge k^*$), given the vector $\mathbf{o}_{k^*-1}$ and the traffic prediction:

$$v^{k^*-1}_{vh,k}(v_{tf,k}) = E[V_{vh,k} \mid V_{tf,k} = v_{tf,k}, \mathbf{o}_{k^*-1}], \quad k \ge k^* \tag{5.5}$$

where $v_{tf,k}$ is the traffic speed measurement or prediction for road segment $k$ at the time when the vehicle arrives at it. The vehicle's travel time from stage $k^*$ to $k$ is estimated according to the vehicle's current speed and the traffic speed information. The gap $\Delta k$ between the prediction stage $k$ and the last observation stage $k^*-1$ is the prediction-ahead step. $v^{k^*-1}_{vh,k}(v_{tf,k})$ is calculated from the conditional joint pdf of the vehicle speed and traffic speed given the observation sequence, $f(v_{vh,k}, v_{tf,k} \mid \mathbf{o}_{k^*-1}, \lambda)$ ($k \ge k^*$). To obtain this conditional joint pdf, two internal statistics need to be determined: the scaled forward probability $\alpha_{k^*-1}(i) = P(q_{k^*-1} = S_i \mid \mathbf{o}_{k^*-1}, \lambda)$ and the probabilities of the following hidden states given the previous observation sequence, $P(q_k = S_j \mid \mathbf{o}_{k^*-1}, \lambda)$ ($k \ge k^*$).

The scaled forward probability $\alpha_{k^*-1}(i)$ is calculated by the forward-backward algorithm [103, 104]. For stage $k^*$, $P(q_{k^*} = S_j \mid \mathbf{o}_{k^*-1}, \lambda)$ is calculated from the state transition probabilities as:

$$P(q_{k^*} = S_j \mid \mathbf{o}_{k^*-1}, \lambda) = \sum_{i \in I_{k^*-1}} a_{ij}\, P(q_{k^*-1} = S_i \mid \mathbf{o}_{k^*-1}, \lambda) \tag{5.6}$$

where $I_{k^*-1}$ is the set of state indices belonging to stage $k^*-1$.
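The propagation in Eq. (5.6), and its repetition for later stages, can be sketched in Python as below. This is a minimal illustration under stated assumptions rather than the toolbox code used in this work: the function names are hypothetical, `alpha` holds the scaled forward probabilities of the states in stage $k^*-1$, and each transition matrix is stored row-wise so that `trans[i][j]` plays the role of $a_{ij}$ between two consecutive stages. The numbers are made up.

```python
def propagate_state_probs(probs, trans):
    """One application of Eq. (5.6): next[j] = sum_i trans[i][j] * probs[i]."""
    n_next = len(trans[0])
    return [
        sum(probs[i] * trans[i][j] for i in range(len(probs)))
        for j in range(n_next)
    ]


def predict_state_probs(alpha, transitions):
    """Propagate through a list of stage-to-stage transition matrices.

    transitions[0] maps stage k*-1 to k*, transitions[1] maps k* to k*+1,
    and so on; the result is the state distribution of the last stage.
    """
    probs = list(alpha)
    for trans in transitions:
        probs = propagate_state_probs(probs, trans)
    return probs


# Hypothetical example with Q = 2 hidden states per stage.
alpha = [0.25, 0.75]
a_first = [[0.9, 0.1], [0.2, 0.8]]
one_step = predict_state_probs(alpha, [a_first])
two_step = predict_state_probs(alpha, [a_first, a_first])
```

Because every row of a transition matrix sums to one, the propagated probabilities still sum to one after any number of steps.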

Similarly, the probabilities of the following hidden states $P(q_k = S_j \mid \mathbf{o}_{k^*-1}, \lambda)$ can be calculated for $k > k^*$. The conditional joint pdf of the vehicle speed and traffic speed given the observation sequence can then be calculated as:

$$f(v_{vh,k}, v_{tf,k} \mid \mathbf{o}_{k^*-1}, \lambda) = \sum_{i \in I_k} b_i(o_k)\, P(q_k = S_i \mid \mathbf{o}_{k^*-1}, \lambda) \tag{5.7}$$

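The predicted vehicle speed is thus calculated as:

$$\begin{aligned} v^{k^*-1}_{vh,k}(v_{tf,k}) &= \int_0^{v^{max}_{vh}} v_{vh,k}\, f(v_{vh,k} \mid v_{tf,k}, \mathbf{o}_{k^*-1}, \lambda)\, dv_{vh,k} \\ &= \int_0^{v^{max}_{vh}} v_{vh,k}\, \frac{f(v_{vh,k}, v_{tf,k} \mid \mathbf{o}_{k^*-1}, \lambda)}{f_{V_{tf,k}}(v_{tf,k})}\, dv_{vh,k} \end{aligned} \tag{5.8}$$

where $v^{max}_{vh}$ is the maximum vehicle speed and $f_{V_{tf,k}}(v_{tf,k})$ is the marginal density of the traffic speed in stage $k$. $f_{V_{tf,k}}(v_{tf,k})$ is approximated as a Gaussian distribution whose mean is the traffic speed prediction result and whose standard deviation is the prediction root mean square error (RMSE).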
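A numerical version of Eqs. (5.7) and (5.8) can be sketched in Python as follows. Several points are assumptions made for this illustration only: each emission density $b_i$ is taken to be a mixture of bivariate Gaussians over (vehicle speed, traffic speed), given as `(weight, (mu_vh, mu_tf, sd_vh, sd_tf, rho))` tuples; the integral is evaluated with the trapezoidal rule; and the denominator is the model-implied marginal obtained by integrating the joint pdf on the same grid, which is a substitution for the Gaussian approximation of $f_{V_{tf,k}}$ described above. All names and numbers are hypothetical.

```python
import math


def bivariate_normal_pdf(v_vh, v_tf, comp):
    """Density of one bivariate Gaussian component at (v_vh, v_tf)."""
    mu_vh, mu_tf, sd_vh, sd_tf, rho = comp
    z_vh = (v_vh - mu_vh) / sd_vh
    z_tf = (v_tf - mu_tf) / sd_tf
    one_minus_r2 = 1.0 - rho * rho
    expo = -(z_vh * z_vh - 2.0 * rho * z_vh * z_tf + z_tf * z_tf) / (2.0 * one_minus_r2)
    norm = 2.0 * math.pi * sd_vh * sd_tf * math.sqrt(one_minus_r2)
    return math.exp(expo) / norm


def joint_pdf(v_vh, v_tf, state_probs, emissions):
    """Eq. (5.7): sum over states of b_i(o_k) * P(q_k = S_i | o_{k*-1})."""
    total = 0.0
    for p_state, mixture in zip(state_probs, emissions):
        b_i = sum(w * bivariate_normal_pdf(v_vh, v_tf, comp) for w, comp in mixture)
        total += p_state * b_i
    return total


def predict_vehicle_speed(v_tf, state_probs, emissions, v_max, n_steps=6000):
    """Eq. (5.8) by the trapezoidal rule on [0, v_max].

    The grid spacing cancels between numerator and denominator.
    """
    num = 0.0
    den = 0.0
    for n in range(n_steps + 1):
        v = v_max * n / n_steps
        weight = 0.5 if n in (0, n_steps) else 1.0
        f = joint_pdf(v, v_tf, state_probs, emissions)
        num += weight * v * f
        den += weight * f
    return num / den


# Hypothetical single-state, single-component model with correlation 0.5.
single = [[(1.0, (20.0, 25.0, 2.0, 2.0, 0.5))]]
pred_single = predict_vehicle_speed(27.0, [1.0], single, v_max=60.0)

# Hypothetical two-state model whose states differ only in mean vehicle speed.
two_states = [
    [(1.0, (10.0, 25.0, 2.0, 2.0, 0.0))],
    [(1.0, (30.0, 25.0, 2.0, 2.0, 0.0))],
]
pred_two = predict_vehicle_speed(25.0, [0.5, 0.5], two_states, v_max=60.0)
```

For the single-component case the result can be checked against the closed-form conditional mean of a bivariate Gaussian, mu_vh + rho (sd_vh / sd_tf)(v_tf - mu_tf), which gives 21 m/s for the values above.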

5.4 Road Network and Simulation Setup

The prediction system is evaluated on the Luxembourg motorway network during the morning rush hours between 7 AM and 10 AM. Luxembourg has the highest density of motorways in Europe and complex traffic. Floating car data, including real-time speeds and driving routes, are a necessary input for the prediction system. Existing Luxembourg ITS systems, e.g., the Ponts et Chaussees traffic monitoring system [105], only provide volume counts or traffic speeds at specific locations. Thus, to simulate floating car data, microscopic traffic simulation is carried out in SUMO with the VehiLux vehicular mobility model [106, 107, 108]. VehiLux generates vehicle driving traces based on real traffic volume counts and the Luxembourg GIS map. The Luxembourg road network in SUMO is shown in Fig. 5.5. The procedure of the simulation data preparation is shown in Fig. 5.6. First, daily traffic count data are downloaded from the Ponts et Chaussees traffic monitoring system by a Python script. Second, based on the volume count data, the VehiLux model simulates the traffic demand and generates vehicle route data with the Dijkstra algorithm, followed by augmentation with Gawron’s dynamic route assignment algorithm [109]. Finally, the vehicle route data are provided to SUMO for simulation, and traffic and vehicle driving data are parsed from the simulation result.

Figure 5.5: Luxembourg road network in SUMO

A 6.5 km stretch of the A3 motorway (south to north direction) in the Luxembourg network is selected as the driving route for the prediction evaluation, as shown in Fig. 5.7. The route is composed of 12 road segments in the VehiLux model. To detect vehicles’ instantaneous driving speeds, sensors are placed every 50 meters from the beginning of each road segment.

[Flowchart: Ponts et Chaussees Traffic Count, VehiLux Model, Vehicle Traces, SUMO Simulator, Traffic and Vehicle Data; step labels: Download, Configuration, Route Assignment, Input, Parse]

Figure 5.6: Procedure of data preparation for traffic prediction based on simulation

The vehicle speed on a road segment is calculated as the average of the speeds captured by all sensors on the road segment. The traffic speed on a road segment within a time interval is calculated by averaging the speeds of all passing vehicles. Five types of vehicles are considered for the vehicle trace generation. They are configured with different lengths and maximum speeds and with the Krauss car-following model. We simulate the morning rush hour traffic on every weekday between March and December 2010 and store the results in XML files. Java programs are developed to parse the vehicle speed, traffic speed, and traffic flow from the XML files. The MATLAB NN and HMM toolboxes [110, 111] are used for model training and prediction. Eq. (5.1) is learned by an NN for each road segment with the traffic prediction period ∆t = 3 min. Future traffic speeds of each road segment are predicted up to five periods ahead. The NN models are trained with traffic data from March to November, and the December data are used for traffic prediction verification. For HMM training, 3000 vehicle traces are selected randomly from the March to November data set. Another 2000 traces from December are used for vehicle speed prediction verification.
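The two speed aggregations described at the start of this paragraph can be sketched in Python as follows. This is an illustration only, since the parsing in this work is done by Java programs. The function names are hypothetical, and binning vehicles by their passage time is an assumption: the text does not state which timestamp assigns a vehicle to a time interval. The default interval of 180 s corresponds to ∆t = 3 min.

```python
from collections import defaultdict


def vehicle_segment_speed(sensor_speeds):
    """Vehicle speed on a segment: mean of the speeds seen by its sensors."""
    return sum(sensor_speeds) / len(sensor_speeds)


def traffic_speeds(passages, interval_s=180.0):
    """Traffic speed per time interval on one road segment.

    passages is an iterable of (passage_time_s, vehicle_speed) pairs; the
    result maps the interval index to the mean speed of the vehicles in it.
    """
    bins = defaultdict(list)
    for t, v in passages:
        bins[int(t // interval_s)].append(v)
    return {idx: sum(vs) / len(vs) for idx, vs in sorted(bins.items())}


# Hypothetical readings in m/s.
segment_speed = vehicle_segment_speed([20.0, 22.0, 24.0])
per_interval = traffic_speeds([(10.0, 20.0), (100.0, 30.0), (200.0, 28.0)])
```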

5.5 Result and Analysis

We first evaluate the traffic speed prediction performance on all road segments in the route. The one-period-ahead prediction result for a randomly selected road segment is shown in Fig. 5.8, with a prediction root mean square error (RMSE) of 2.434 m/s. The RMSEs of all road segments with different prediction ahead periods are shown in Fig. 5.9. Although the RMSEs become larger as the number of ahead periods increases, they remain below 3.5 m/s, which indicates satisfactory prediction accuracy.

To evaluate the vehicle speed prediction performance, we randomly pick one type of vehicle, i.e., “twingo”. AIC and BIC values are computed for HMMs with different (Q,M) configurations and shown in Fig. 5.10a and Fig. 5.10b, respectively. (Q=3,M=7) and (Q=2,M=5) are selected as the optimal configurations because they give the smallest AIC and BIC values, respectively.

Figure 5.7: Road set for prediction in Luxembourg motorway network

[Plot: Traffic Speed versus Time Slot; series: Observation and Prediction]

Figure 5.8: Traffic prediction result for road segment #7 with one prediction period ahead

HMMs with these two configurations are trained by the Baum-Welch algorithm and are used for vehicle speed prediction. A trained HMM with (Q = 3, M = 7) for a randomly selected road segment is checked. Samples generated by the HMM are compared with the simulation observations (not used for the HMM training), as shown in Fig. 5.11a and 5.11b. The results show that the HMM can effectively reproduce the relationship between traffic speed and vehicle speed, which is important for accurate vehicle speed prediction.

[Plot: RMSE (m/s) versus Road Index; series: 1-Period Ahead, 3-Period Ahead, 5-Period Ahead]

Figure 5.9: Traffic speed prediction RMSE of all road segments

We denote our proposed method as NN/HMM. For performance evaluation, NN/HMM is compared with two other methods: the traffic speed approximation (TSAP) and KDE combined with NN (NN/KDE), which is similar to the method in [94]. TSAP simply approximates the individual vehicle speed as the traffic speed predicted by the NN. In NN/KDE, the conditional pdf of the vehicle speed given the traffic speed on each road segment is learned by KDE, and the vehicle speed is predicted as the conditional expectation according to this pdf and the traffic speed prediction. Fig. 5.12 and Fig. 5.13 show the RMSEs and mean absolute percentage errors (MAPEs), respectively, of the vehicle speed prediction with these three methods, where NN/HMM is evaluated with one prediction step ahead (∆k = 1). The results show that NN/HMM (Q=3, M=7) outperforms the others, while TSAP has the worst accuracy. Compared with TSAP and NN/KDE, NN/HMM (Q=3, M=7) reduces the RMSE by 45.1% and 18.2% on average, respectively. We also evaluate the influence of the prediction ahead step ∆k on the prediction accuracy. Fig. 5.14 shows that the prediction RMSE increases as ∆k becomes larger. When ∆k < 7, NN/HMM (Q=3, M=7) outperforms NN/KDE for most road segments. Finally, we check the absolute prediction errors of 2000 individual vehicles on a selected road segment, with the histogram and pdf plotted in Fig. 5.15. The 98.7th percentile of the absolute error is 1 m/s, which indicates satisfactory prediction accuracy.
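The two error measures used in this comparison can be written out as a short Python sketch. It assumes the standard definitions of RMSE and MAPE; the function names and sample speeds are hypothetical, and MAPE as written requires the observed speeds to be non-zero.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted speeds."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(observed, predicted):
    """Mean absolute percentage error in percent (observed must be non-zero)."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical speeds in m/s.
example_rmse = rmse([20.0, 30.0], [23.0, 26.0])
example_mape = mape([20.0, 25.0], [22.0, 20.0])
```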

[Plot (a): AIC (×10^5) versus Number of Gaussian Components (M); one curve for each of Q = 1 to 6]

(a) AIC values with different configurations

[Plot (b): BIC (×10^5) versus Number of Gaussian Components (M); one curve for each of Q = 1 to 6]

(b) BIC values with different configurations

Figure 5.10: AIC and BIC values for HMMs with different (Q, M) configurations

[Plot (a): Vehicle Speed (m/s) versus Traffic Speed (m/s)]

(a) Samples generated by HMM

[Plot (b): Vehicle Speed (m/s) versus Traffic Speed (m/s)]

(b) Simulation observations

Figure 5.11: Comparison between HMM sampling and simulation observation for one road segment

[Plot: RMSE (m/s) versus Road Index; series: TSAP, NN/KDE, NN/HMM (Q=3, M=7), NN/HMM (Q=2, M=4)]

Figure 5.12: Vehicle speed prediction RMSE of TSAP, NN/KDE and NN/HMM (∆k = 1)

[Plot: MAPE (%) versus Road Index; series: TSAP, NN/KDE, NN/HMM (Q=3, M=7), NN/HMM (Q=2, M=4)]

Figure 5.13: Vehicle speed prediction MAPE of TSAP, NN/KDE and NN/HMM (∆k = 1)

[Plot: RMSE (m/s) versus Road Index; series: NN/HMM (k=1), NN/HMM (k=3), NN/HMM (k=5), NN/HMM (k=7), NN/KDE]

Figure 5.14: Vehicle speed prediction RMSE of NN/KDE and NN/HMM with different ∆k

[Plot: histogram and pdf over Prediction Absolute Difference]

Figure 5.15: Histogram and pdf of vehicle speed prediction absolute error for road segment #5

Chapter 6

Conclusion and Future Research

This dissertation focuses on optimization and management system design for cost-effective and energy-efficient CPS, including the design of an energy ecosystem with hierarchical optimization, a V2G system for reactive power compensation, an on-road PHEV power management system, and vehicle speed prediction in vehicular networks. We first propose an energy ecosystem with DR and DER management. DR considers updated external information and users’ task preferences in decision making. Preference functions, modeled with KDE, are used to describe users’ task preferences. With two-level shared cost-led management, µCHPs are fully utilized to reduce the energy consumption cost of the whole community. Finally, VRB management with Q-learning obtains the optimal discharging policy considering the utility price and the stochastic elements of wind power and load demand. Simulation results show that this management system is effective in reducing the energy consumption cost.

Because PEVs and PHEVs play significant roles in smart grid CPS, we then study their V2G applications. In a smart city with a smart grid, the travel information of each PEV agent can be collected from cellular devices and used in optimal applications, so as to resolve service conflicts, augment PEV agent benefits, and enhance the performance of the power distribution network. With on-board bidirectional AC chargers, PEVs are utilized as mobile and distributed VAr resources for reactive power compensation to the grid. We propose an optimal scheduling scheme for PEV parking and charging as the responsibility of a PEV aggregator. PEVs are scheduled after their reservations, i.e., charging requirements and parking preferences, are received by the aggregator. The scheduling problem is formulated as a multi-objective optimization problem considering both PEV agent benefits and the grid compensation performance. The original non-linear optimization problems are reformulated as MILP problems for efficient solving. With Lagrangian decomposition and the NNC method, Pareto points are solved in a decentralized way, and the approach is scalable in terms of the number of PEVs and charging stations. Simulation results from seven test cases show satisfactory solution quality and the effects of different aspects on PEV benefits and reactive power compensation performance. The trade-off between the two objectives is analyzed in detail. The load flow analysis demonstrates the effectiveness of the proposed V2G system in reducing power loss.

Besides smart grid power management and support, we also study the economical operation of PHEVs, i.e., optimal power management. We propose an on-road hierarchical power management CPS for PHEVs in vehicular networks. Driving cycles are decomposed and modeled with unit cycles in the spatial domain. The high-level online battery budget generation is formulated as an MSQP problem and solved with the nested decomposition algorithm. M/G2 torque and ICE speed policies are generated offline at the low level. The low-level management is formulated as a finite-horizon MDP and solved with the backward induction method. During driving, the high-level battery energy budgets are generated in real time. According to the battery energy budgets, power management decisions are made by looking up the policy tables. Simulation results for five different cases show that the proposed 2-level MSQP/MDP method can utilize the vehicle speed prediction information and adapt to the stochastic real-time driving states so as to minimize the fuel consumption for the entire trip.

Vehicle speed prediction serves as an important input for our on-road PHEV power management systems. Finally, a novel two-level vehicle speed prediction system for highway networks is proposed based on NN and HMM. According to vehicle driving routes, the traffic speed of target road segments is first predicted at the first level by NN with historical traffic data. At the second level, the statistical relationship between traffic speed and vehicle speed is modeled by HMM to account for the existence of unobservable states. The proposed method is compared with two other methods, namely traffic speed approximation and the KDE method. Results show that our proposed method outperforms the others in terms of prediction accuracy.

There are several directions for our future research. First, our existing system models can be further enhanced according to real system characteristics. For instance, in our V2G reactive power compensation system, the effect of random PEV parking and charging (without reservation) on existing scheduling should be considered. To handle this stochastic element, the scheduling system should be augmented with algorithms that deal with possible conflicts and adjust the schedule online. Second, our PHEV power management work and the vehicle speed prediction work should be integrated and evaluated as a whole, since the latter serves as the input for the former. Constrained by limited on-board computation resources, the integration requires faster algorithms so that the on-road computation requirement can be met. Finally, our work is currently evaluated and verified based on simulations. System evaluation based on real data from the power and vehicle industries is another important direction for future work.

Bibliography

[1] M. Albadi and E. El-Saadany, “Demand response in electricity markets: An overview,” in

Proc. IEEE Power Engineering Society General Meeting, June 2007.

[2] California Center for Sustainable Energy, Plug-in Electric Vehicles (PEVs), Available:

http://energycenter.org/index.php/technical-assistance/transportation/electric-vehicles.

[3] B. Kramer, S. Chakraborty, and B. Kroposki, “A review of plug-in vehicles and vehicle-to-grid

capability,” in Proc. IEEE Annual Conf. on Industrial Electronics, Nov. 2008, pp. 2278–2283.

[4] T. Markel, M. Kuss, and M. Simpson, “Value of plug-in vehicle grid support operation,” in

Proc. IEEE Innovative Technologies for an Efficient & Reliable Electricity Supply, Sept. 2010,

pp. 325–332.

[5] P. Vovos, A. Kiprakis, A. Wallace, and G. Harrison, “Centralized and distributed voltage

control: Impact on distributed generation penetration,” IEEE Trans. Power Systems, vol. 22,

no. 1, pp. 476–483, Feb. 2007.

[6] S. Bolognani and S. Zampieri, “A distributed control strategy for reactive power compensation

in smart microgrids,” ArXiv e-prints, 2011.

[7] S.-Y. Lee, C.-J. Wu, and W.-N. Chang, “A compact control algorithm for reactive

power compensation and load balancing with static Var compensator,” Electric

Power Systems Research, vol. 58, no. 2, pp. 63–70, 2001. [Online]. Available:

http://www.sciencedirect.com/science/article/pii/S0378779601001274

[8] M. Kisacikoglu, B. Ozpineci, and L. Tolbert, “Examination of a PHEV bidirectional charger

system for V2G reactive power compensation,” in Proc. IEEE Applied Power Electronics

Conference and Exposition, Feb. 2010, pp. 458–465.

[9] A. Messac, A. Ismail-Yahaya, and C. Mattson, “The normalized normal constraint method

for generating the pareto frontier,” Structural and Multidisciplinary Optimization, vol. 25, pp.

86–98, 2003.

[10] P. Luh, L. Michel, P. Friedland, C. Guan, and Y. Wang, “Load forecasting and demand

response,” in Proc. IEEE Power and Energy Society General Meeting, July 2010.

[11] M. Miller, K. Griendling, and D. Mavris, “Exploring human factors effects in the smart grid

system of systems demand response,” in Proc. Int. Conf. System of Systems Engineering, 2012,

pp. 1–6.

[12] D. Livengood and R. Larsen, “The energy box: Locally automated optimal control of residen-

tial electricity usage,” Service Science, vol. 1, no. 1, pp. 1–16, 2009.

[13] D. O’Neill, M. Levorato, A. Goldsmith, and U. Mitra, “Residential demand response using

reinforcement learning,” in Proc. IEEE Int. Conf. Smart Grid Communications, Oct. 2010, pp.

409–414.

[14] S. Ramchurn, P. Vytelingum, A. Rogers, and N. Jennings, “Agent-based control for decen-

tralised demand side management in the smart grid,” in Proc. Int. Conf. Autonomous Agents

and Multiagent Systems, Feb. 2011, pp. 5–12.

[15] W. Gu, Z. Wu, and X. Yuan, “Microgrid economic optimal operation of the combined heat

and power system with renewable energy,” in Proc. IEEE Power and Energy Society General

Meeting, July 2010, pp. 1–6.

[16] A. Rogers, S. Maleki, S. Ghosh, and N. R. Jennings, “Adaptive home heating control through

gaussian process prediction and mathematical programming,” in Int. Workshop Agent Technol-

ogy for Energy Systems, May 2011, pp. 71–78.

[17] X. Guan, Z. Xu, and Q.-S. Jia, “Energy-efficient buildings facilitated by microgrid,” IEEE

Trans. Smart Grid, vol. 1, no. 3, pp. 243–252, Dec. 2010.

[18] Y. Zhang, N. Gatsis, and G. Giannakis, “Robust energy management for microgrids with

high-penetration renewables,” IEEE Trans. Sustainable Energy, vol. PP, no. 99, pp. 1–10,

2013.

[19] A. Peacock and M. Newborough, “Impact of micro-CHP systems on domestic sector CO2

emissions,” Applied Thermal Engineering, vol. 25, no. 17-18, pp. 2653–2676, 2005.

[20] A. Hawkes and M. Leach, “Cost-effective operating strategy for residential micro-combined

heat and power,” Energy, vol. 32, no. 5, pp. 711–723, 2007.

[21] M. Houwing, R. Negenborn, and B. De Schutter, “Demand response with micro-chp systems,”

Proc. IEEE, vol. 99, no. 1, pp. 200–213, Jan. 2011.

[22] U.S. Environmental Protection Agency, New York Net-Metering Rules , Available:

http://www.epa.gov/chp/policies/policies/ nenewyorknetmeteringrules.html.

[23] L. Barote, R. Weissbach, R. Teodorescu, C. Marinescu, and M. Cirstea, “Stand-alone wind

system with vanadium redox battery energy storage,” in Proc. Int. Conf. Optimization of

Electrical and Electronic Equipment, May 2008, pp. 407–412.

[24] W. Wang, B. Ge, D. Bi, and D. Sun, “Grid-connected wind farm power control using VRB-

based energy storage system,” in Proc. IEEE Energy Conversion Congress and Exposition,

Sept. 2010, pp. 3772–3777.

[25] J. Chahwan, C. Abbey, and G. Joos, “VRB modelling for the study of output terminal voltages,

internal losses and performance,” in Proc. IEEE Electrical Power Conference, Oct. 2007, pp.

387–392.

[26] B. Silverman, Density Estimation for Statistics and Data Analysis. New York: Chapman and

Hall, 1986.

[27] F. Zenith and S. Skogestad, “Control of fuel cell power output,” Journal of Process Control,

vol. 17, no. 4, pp. 333–347, 2007.

[28] P. Zhao, H. Zhang, H. Zhou, J. Chen, S. Gao, and B. Yi, “Characteristics and performance of

10kw class all-vanadium redox-flow battery stack,” Journal of Power Sources, vol. 162, no. 2,

pp. 1416–1420, 2006.

[29] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proc. IEEE Int. Conf. Neural

Networks, vol. 4, Nov 1995, pp. 1942–1948.

[30] R. Eberhart and J. Kennedy, “A new optimizer using particle swarm theory,” in Proc. Int.

Symposium Micro Machine and Human Science, Oct 1995, pp. 39–43.

[31] K. E. Parsopoulos and M. N. Vrahatis, Particle Swarm Optimization and Intelligence: Ad-

vances and Applications, 2010.

[32] X. Hu and R. Eberhart, “Solving constrained nonlinear optimization problems with parti-

cle swarm optimization,” in Proc. World Multiconference on Systemics, Cybernetics and

Informatics, 2002, pp. 203–206.

[33] E. Laskari, K. Parsopoulos, and M. Vrahatis, “Particle swarm optimization for integer pro-

gramming,” in Proc. Evolutionary Computation, vol. 2, 2002, pp. 1582–1587.

[34] Y. del Valle, G. Venayagamoorthy, S. Mohagheghi, J.-C. Hernandez, and R. Harley, “Particle

swarm optimization: Basic concepts, variants and applications in power systems,” IEEE Trans.

Evolutionary Computation, vol. 12, no. 2, pp. 171–195, April 2008.

[35] Y. Yare, G. Venayagamoorthy, and U. Aliyu, “Optimal generator maintenance scheduling

using a modified discrete PSO,” IET Journal Generation, Transmission and Distribution,

vol. 2, no. 6, pp. 834–846, Nov 2008.

[36] J.-M. Yang, Y.-P. Chen, J.-T. Horng, and C.-Y. Kao, “Applying family competition to evolution

strategies for constrained optimization,” in Evolutionary Programming VI, ser. Lecture Notes

in Computer Science, P. Angeline, R. Reynolds, J. McDonnell, and R. Eberhart, Eds. Springer

Berlin / Heidelberg, 1997, vol. 1213, pp. 201–211.

[37] J. Kennedy and R. Eberhart, “A discrete binary version of the particle swarm algorithm,” in

Proc. IEEE Int. Conf. Systems, Man, and Cybernetics, vol. 5, Oct 1997, pp. 4104–4108.

[38] C. Watkins and P. Dayan, “Q-learning,” Machine Learning, vol. 8, no. 3-4, pp. 279–292, 1992.

[39] A. Faruqui and S. Sergici, “Household response to dynamic pricing of electricity: a survey of

15 experiments,” J. of Regulatory Economics, vol. 38, no. 2, pp. 193–225, 2010.

[40] Jacobs 31-20 (20kw) Complete System Pricing, Available: http://www.windturbine.net/documents/Pricing/wtic_retail_pricing_2012.01.pdf.

[41] Operational and Maintenance Costs for Wind Turbines, Available: http://www.windmeasurementinternational.com/wind-turbines/om-turbines.php.

[42] VRB Flow Battery Demonstration, Available: http://apps1.eere.energy.gov/tribalenergy/pdfs/wind_akres04.pdf.

[43] M. Yilmaz and P. Krein, “Review of benefits and challenges of vehicle-to-grid technology,” in

Proc. IEEE Energy Conversion Congress and Exposition (ECCE), 2012, pp. 3082–3089.

[44] E. Sortomme and M. El-Sharkawi, “Optimal scheduling of vehicle-to-grid energy and ancillary

services,” IEEE Trans. Smart Grid, vol. 3, no. 1, pp. 351–359, 2012.

[45] A. Lam, K.-C. Leung, and V. Li, “Capacity management of vehicle-to-grid system for power

regulation services,” in Proc. IEEE Int. Conf. on Smart Grid Communications, 2012, pp.

442–447.

[46] K. Shimizu, T. Masuta, Y. Ota, and A. Yokoyama, “Load frequency control in power system

using vehicle-to-grid system considering the customer convenience of electric vehicles,” in

Proc. Int. Conf. on Power System Technology, 2010, pp. 1–8.

[47] M. Kisacikoglu, B. Ozpineci, and L. Tolbert, “Effects of V2G reactive power compensation

on the component selection in an ev or phev bidirectional charger,” in Proc. IEEE Energy

Conversion Congress and Exposition, 2010, pp. 870–876.

[48] C. Wu, H. Mohsenian-Rad, and J. Huang, “PEV-based reactive power compensation for wind

DG units: A stackelberg game approach,” in Proc. IEEE Int. Conf. Smart Grid Communica-

tions, 2012, pp. 504–509.

[49] C. Wu, H. Mohsenian-Rad, J. Huang, and J. Jatskevich, “PEV-based combined frequency and

voltage regulation for smart grid,” in Proc. IEEE PES Innovative Smart Grid Technologies,

2012, pp. 1–6.

[50] F. Glover, “Improved linear integer programming formulations of nonlinear integer problems,”

Management Science, vol. 22, no. 4, pp. 455–460, 1975.

[51] A. Balakrishnan and S. C. Graves, “A composite algorithm for a concave-cost network

flow problem,” Networks, vol. 19, no. 2, pp. 175–202, 1989. [Online]. Available:

http://dx.doi.org/10.1002/net.3230190202

[52] D. Yue, G. Guillén-Gosálbez, and F. You, “Global optimization of large-scale mixed-integer

linear fractional programming problems: A reformulation-linearization method and process

scheduling applications,” AIChE Journal, vol. 59, no. 11, pp. 4255–4272, 2013. [Online].

Available: http://dx.doi.org/10.1002/aic.14185

[53] H. D. Sherali and O. Ulular, “A primal-dual conjugate subgradient algorithm for specially

structured linear and convex programming problems,” Applied Mathematics and Optimization,

vol. 20, pp. 193–221, 1989. [Online]. Available: http://dx.doi.org/10.1007/BF01447654

[54] E. Kutanoglu and S. Wu, “On combinatorial auction and lagrangean relaxation for distributed

resource scheduling,” IIE Trans., vol. 31, no. 9, pp. 813–826, 1999.

[55] TOMLAB Optimization in MATLAB, Available: http://tomopt.com/tomlab/.

[56] U.S. Environmental Protection Agency, Sector Collaborative on

Energy Efficiency Accomplishments and Next Steps, Available:

http://www.epa.gov/cleanenergy/documents/suca/sector collaborative.pdf.

[57] Norman Disney & Young, Power Factor Correction Evaluation, Available:

http://www.abcb.gov.au.

[58] S. Lin, D. Salles, W. Freitas, and W. Xu, “An intelligent control strategy for power factor

compensation on distorted low voltage power systems,” Smart Grid, IEEE Transactions on,

vol. 3, no. 3, pp. 1562–1570, Sept 2012.

[59] Power factor correction for buildings: Power quality improved, Avail-

able: http://www.epcos.com/epcos-en/373562/tech-library/articles/applications—

cases/applications—cases/power-quality-improved/171824.

[60] W. H. Kersting, Distribution System Modeling and Analysis. CRC Press, Jan. 2012.

[61] B. Mashadi and S. Emadi, “Dual-mode power-split transmission for hybrid electric vehicles,”

IEEE Trans. Vehicular Technology, vol. 59, no. 7, pp. 3223–3232, 2010.

[62] Y. Li and N. Kar, “Advanced design approach of power split device of plug-in hybrid electric

vehicles using dynamic programming,” in Proc. of IEEE Vehicle Power and Propulsion

Conference, 2011, pp. 1–6.

[63] Y. Li and N. C. Kar, “Investigating the effects of power split PHEV transmission gear ratio to

operation cost,” in Proc. of IEEE Transportation Electrification Conference and Expo, 2012,

pp. 1–6.

[64] S. Moura, H. Fathy, D. Callaway, and J. Stein, “A stochastic optimal control approach for power

management in plug-in hybrid electric vehicles,” IEEE Trans. Control Systems Technology,

vol. 19, no. 3, pp. 545–555, 2011.

[65] Q. Gong, Y. Li, and Z.-R. Peng, “Optimal power management of plug-in HEV with intel-

ligent transportation system,” in Proc. of IEEE/ASME Int. Conf. on Advanced Intelligent

Mechatronics, Sept. 2007.

[66] Y. Bin, Y. Li, Q. Gong, and Z.-R. Peng, “Multi-information integrated trip specific opti-

mal power management for plug-in hybrid electric vehicles,” in Proc. of American Control

Conference, June 2009.

[67] M. Zhang, Y. Yang, and C. Mi, “Analytical approach for the power management of blended-

mode plug-in hybrid electric vehicles,” IEEE Trans. Vehicular Technology, vol. 61, no. 4, pp.

1554–1566, May 2012.

[68] N. Schouten, M. Salman, and N. Kheir, “Fuzzy logic control for parallel hybrid vehicles,”

IEEE Trans. Control Systems Technology, vol. 10, no. 3, pp. 460–468, 2002.

[69] J. Gao, F. Sun, H. He, G. Zhu, and E. Strangas, “A comparative study of supervisory control

strategies for a series hybrid electric vehicle,” in Proc. of Asia-Pacific Power and Energy

Engineering Conference, 2009, pp. 1–7.

[70] M. Koot, J. T. B. A. Kessels, B. de Jager, W. P. M. H. Heemels, P. P. J. Van den Bosch, and

M. Steinbuch, “Energy management strategies for vehicular electric power systems,” IEEE

Trans. Vehicular Technology, vol. 54, no. 3, pp. 771–782, 2005.

[71] S. Adhikari, S. Halgamuge, and H. Watson, “An online power-balancing strategy for a parallel

hybrid electric vehicle assisted by an integrated starter generator,” IEEE Trans. Vehicular

Technology, vol. 59, no. 6, pp. 2689–2699, July 2010.

[72] Y. Qi and S. Ishak, “Stochastic approach for short-term freeway traffic prediction during peak

periods,” IEEE Trans. on Intelligent Transportation Systems, vol. 14, no. 2, pp. 660–672,

2013.

[73] S. Ishak, C. Mamidala, and Y. Qi, “Stochastic characteristics of freeway traffic speed during

breakdown and recovery periods,” J. Transportation Research Board, vol. 2178, pp. 79–89,

2010.

[74] X. Zhang and C. Mi, Vehicle Power Management: Modeling, Control and Optimization.

Springer, 2011.

[75] F. V. Louveaux, “A solution method for multistage stochastic programs with recourse with

application to an energy investment problem,” Operations Research, vol. 28, no. 4, pp. 889–

902, 1980.

[76] J. R. Birge and F. Louveaux, Introduction to Stochastic Programming. Springer, 2011.

[77] T. W. Anderson and L. A. Goodman, “Statistical inference about markov chains,” The Annals

of Mathematical Statistics, vol. 28, no. 1, pp. 89–110, 1957.

[78] N. Bauerle and U. Rieder, Markov Decision Processes with Applications to Finance. Springer,

2011.

[79] ADVISOR Software for Advanced Vehicle Energy Analysis, Available:

http://bigladdersoftware.com/advisor/.

[80] P. Dey, S. Chandra, and S. Gangopadhaya, “Speed distribution curves under mixed traffic

conditions,” J. Transp. Eng., vol. 132, no. 6, pp. 475–481, 2006.

[81] D. Berry and D. Belmont, “Distribution of vehicle speeds and travel times,” 1951, pp. 589–602.

[82] K. Ahn and P. Y. Papalambros, “Engine optimal operation lines for power-split hybrid electric

vehicles,” The Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering,

vol. 223, pp. 1149–1162, 2009.

[83] A. May, Traffic flow fundamentals. Prentice-Hall, Englewood Cliffs, New Jersey, 1990.

[84] A. Messmer and M. Papageorgiou, “Metanet: A macroscopic simulation program for motor-

way networks,” Traffic Eng. and Control, vol. 31, no. 9, pp. 466–470, 1990.

[85] I. Prigogine and F. C. Andrews, “A boltzmann-like approach for traffic flow,” Operations

Research, vol. 8, no. 6, pp. 789–797, 1960.

[86] I. Prigogine and R. C. Herman, Kinetic Theory of Vehicular Traffic. Prentice-Hall, Englewood

Cliffs, New Jersey, 1990.

[87] T. Szczuraszek and R. Krystek, “A macroscopic model for traffic speed prediction on two-

lane roads,” Transportation Systems: Theory and Application of Advanced Technology, pp.

185–190, 1995.

[88] B. L. Smith and M. J. Demetsky, “Traffic flow forecasting: comparison of modeling approaches,”

J. of Transportation Engineering, vol. 123, no. 4, pp. 261–266, 1997.

[89] C. Chen, J. Hu, Q. Meng, and Y. Zhang, “Short-time traffic flow prediction with ARIMA-GARCH

model,” in Proc. IEEE Symp. on Intelligent Vehicles, June 2011.

[90] B. Zhang, K. Xing, X. Cheng, L. Huang, and R. Bie, “Traffic clustering and online traffic

prediction in vehicle networks: A social influence perspective,” in Proc. of IEEE INFOCOM,

2012, pp. 495–503.

[91] V. Hodge, R. Krishnan, T. Jackson, J. Austin, and J. Polak, “Short term traffic prediction

using a binary neural network,” in Proc. Annu. Universities’ Transport Study Group Conf.,

Jan. 2011.

[92] J. Park, D. Li, Y. Murphey, J. Kristinsson, R. McGee, M. Kuang, and T. Phillips, “Real time

vehicle speed prediction using a neural network traffic model,” in Proc. of Int. Joint Conf. on

Neural Networks, July 2011, pp. 2991–2996.

[93] S. Rogers and W. Zhang, “Development and evaluation of a curve rollover warning system for

trucks,” in Proc. IEEE Intelligent Vehicles Symp., June 2003, pp. 294–297.

[94] J. McNew, “Predicting cruising speed through data-driven driver modeling,” in Proc. IEEE

Conf. on Intelligent Transportation Systems, Sept. 2012, pp. 1789–1796.

[95] D. Li, T. Bansal, Z. Lu, and P. Sinha, “Marvel: Multiple antenna based relative vehicle

localizer,” in Proc. Int. Conf. on Mobile Computing and Networking, 2012, pp. 245–256.

[96] L. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,”

Proc. IEEE, vol. 77, no. 2, pp. 257–286, Feb. 1989.

[97] A. Jain, J. Mao, and K. Mohiuddin, “Artificial neural networks: a tutorial,” Computer, vol. 29,

no. 3, pp. 31–44, Mar. 1996.

[98] J. A. Bilmes et al., “A gentle tutorial of the EM algorithm and its application to parameter

estimation for Gaussian mixture and hidden Markov models,” Int. Computer Science Institute,

vol. 4, no. 510, p. 126, 1998.

[99] H. Akaike, “A new look at the statistical model identification,” IEEE Trans. Automatic Control,

vol. 19, no. 6, pp. 716–723, Dec 1974.

[100] W. Zucchini and I. L. MacDonald, Hidden Markov Models for Time Series: An Introduction

Using R. Chapman and Hall/CRC, 2009.

[101] J. J. Dziak, D. L. Coffman, S. T. Lanza, and R. Li, “Sensitivity and specificity of information

criteria,” College of Health and Human Development, The Pennsylvania State University,

Tech. Rep. 12-119, June 2012.

[102] L. E. Baum, T. Petrie, G. Soules, and N. Weiss, “A maximization technique occurring in the

statistical analysis of probabilistic functions of Markov chains,” The Annals of Mathematical

Statistics, vol. 41, no. 1, pp. 164–171, 1970.

[103] L. E. Baum, J. Eagon et al., “An inequality with applications to statistical estimation for

probabilistic functions of Markov processes and to a model for ecology,” Bull. Amer. Math.

Soc., vol. 73, no. 3, pp. 360–363, 1967.

[104] L. E. Baum and G. Sell, “Growth functions for transformations on manifolds,” Pacific J. Math,

vol. 27, no. 2, pp. 211–227, 1968.

[105] Ponts et Chaussees Traffic Count, Available: http://www.pch.public.lu/trafic/comptage/index.html.

[106] A. Grzybek, G. Danoy, and P. Bouvry, “Generation of realistic traces for vehicular mobility

simulations,” in Proc. ACM Int. Symp. on Design and Analysis of Intelligent Vehicular

Networks and Applications, 2012, pp. 131–138.

[107] VehiLux: Realistic Vehicular Traces for VANETS Simulator, Available:

http://vehilux.gforge.uni.lu/index.html.

[108] SUMO: Simulation of Urban MObility, Available: http://sumo.dlr.de/wiki/Main_Page.

[109] C. Gawron, “An iterative algorithm to determine the dynamic user equilibrium in a traffic

simulation model,” Int. J. of Modern Physics C, vol. 9, no. 3, pp. 393–407, Dec 1998.

[110] Neural Network Toolbox, Available: http://www.mathworks.com/products/neural-network/.

[111] Hidden Markov Model (HMM) Toolbox for Matlab, Available:

http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html.
