Optimization and Management of Cyber-Physical Systems - Smart
Grid and Plug-in Hybrid Electric Vehicles
A Dissertation Presented
by
Bingnan Jiang
to
The Department of Electrical and Computer Engineering
in partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
in
Computer Engineering
Northeastern University
Boston, Massachusetts
August 2015
To my family.
Contents
List of Figures
List of Tables
Acknowledgments
Abstract of the Dissertation
1 Introduction
1.1 Contributions
1.2 Thesis Organization
2 Optimal Energy Management in smart microgrid
2.1 Background and Motivation
2.2 System Overview
2.3 Design of Distributed DR Systems
2.3.1 DR Model
2.3.2 Optimization problem formation
2.4 Shared Cost-led µCHPs Management
2.4.1 µCHP Model
2.4.2 Shared Cost-led µCHPs Management Strategy
2.5 VRB Discharging Management with Q-Learning
2.6 Problem-solving Algorithms
2.7 Bill Balancing Algorithm
2.8 Simulation and Result
2.8.1 Simulation Configuration
2.8.2 Result Analysis
2.8.3 Distributed DR Results and Analysis
3 Vehicle-to-Grid Reactive Power Compensation
3.1 Background and Motivation
3.2 V2G System Description
3.2.1 System Overview
3.2.2 Model of On-board Charger
3.3 Multi-objective Optimization Formulation
3.3.1 Optimizing PEV Agent Benefits
3.3.2 Optimizing Utility Grid Reactive Power Compensation
3.3.3 Multi-Objective Optimization Formulation
3.4 Multi-Objective Optimization Solution Approach
3.4.1 Problem linearization
3.4.2 Normalized Normal Constraint Method
3.4.3 Decentralized Algorithm Based on Lagrangian Decomposition
3.5 Results and Analysis
4 On-road PHEV Power Management in Vehicular Networks
4.1 Background and Motivation
4.2 System Design
4.2.1 Overview of the System
4.2.2 System Models
4.3 HIERARCHICAL POWER MANAGEMENT ALGORITHMS AND SOLUTIONS
4.3.1 Hierarchical Power Management Algorithms
4.3.2 Optimization Formulation and Solutions
4.4 RESULTS AND ANALYSIS
5 Traffic and Vehicle Speed Prediction in Vehicular Networks
5.1 Background and Motivation
5.2 System Description
5.3 Vehicle Speed Prediction System Design
5.3.1 Traffic Speed Prediction with NN
5.3.2 Vehicle Speed Prediction with HMM
5.4 Road Network and Simulation Setup
5.5 Result and Analysis
6 Conclusion and Future Research
Bibliography
List of Figures
2.1 Energy ecosystem in smart microgrid
2.2 Scheme of the microgrid for a community
2.3 Microgrid management based on hierarchical optimization and bill balance
2.4 Model of Dynamic DR
2.5 µCHP electric energy flow among contributors and beneficiaries in the microgrid
2.6 Wind turbine power generation
2.7 Utility electricity price
2.8 Cost comparison between our energy ecosystem and the conventional system
2.9 Electricity consumption cost of one sample house in each day
2.10 Satisfaction degree of one house in each day
2.11 Electric and thermal load demand of the community in the evaluated day
2.12 Total µCHPs and heat pumps generation in the community in the evaluated day
2.13 Hot water tank temperature in the evaluated house
2.14 Energy consumption cost of the community with µCHP system generation
2.15 Energy consumption cost of the community with VRB discharging
3.1 Electrical and geographical map layers of the V2G reactive power compensation system
3.2 Charger operation mode
3.3 Solution approach diagram
3.4 Framework of the decentralized optimization with Lagrangian relaxation
3.5 Simulation case setup for a distribution feeder and locations of charging stations
3.6 Iterations of solving an anchor point in case 2
3.7 Pareto optimal points solved in cases 2, 3, and 6
3.8 Duality gaps of Pareto optimal points in cases 2, 3, and 6
3.9 Average scheduled PEV unit cost per convenience of Pareto points in 3 study cases
3.10 Total PEV drop penalty of Pareto points in 3 study cases
3.11 Average power loss ratios of three charging schemes in the 7 test cases
4.1 Scheme of on-road PHEV power management system
4.2 A PHEV model with PSD
4.3 Unit cycle models for urban roads and freeway
4.4 PHEV hierarchical mode for PHEV power management
4.5 Diagram of stochastic programming for online PHEV power management
4.6 UDDS driving cycle and a sample of generated stochastic driving cycle in spatial domain
4.7 Speed transition probabilities for urban roads with light traffic
4.8 MDP policy maps for an urban unit cycle with length index il = 2, stage index k = 15, light traffic, and remaining battery budget 0.015 kWh
4.9 Expected fuel consumption of an urban unit cycle (Length index il = 2) with MDP and CDCS
4.10 Required torque on the final drive shaft
4.11 ICE and EM torque output and fuel consumption along distance in MSQP/MDP and CDCS
4.12 Battery SOC along driving distance in a sample test driving cycle
4.13 Operation points on ICE efficiency map
4.14 Fuel consumption comparison
5.1 Scheme of the 2-level vehicle speed prediction system
5.2 Diagram of a NN neuron
5.3 NARX NN Model for Traffic Speed Prediction
5.4 Left-to-right HMM for vehicle speed prediction
5.5 Luxembourg road network in SUMO
5.6 Procedure of data preparation for traffic prediction based on simulation
5.7 Road set for prediction in Luxembourg motorway network
5.8 Traffic prediction result for road segment #7 with one prediction period ahead
5.9 Traffic speed prediction RMSE of all road segments
5.10 AIC and BIC values for HMMs with different (Q, M) configurations
5.11 Comparison between HMM sampling and simulation observation for one road segment
5.12 Vehicle speed prediction RMSE of TSAP, NN/KDE and NN/HMM (∆k = 1)
5.13 Vehicle speed prediction MAPE of NN, NN/KDE and NN/HMM (∆k = 1)
5.14 Vehicle speed prediction RMSE of NN/KDE and NN/HMM with different ∆k
5.15 Histogram and pdf of vehicle speed prediction absolute error for road segment #5
List of Tables
3.1 Parking interval and station capacity configurations for different cases
4.1 Configuration of PHEV Powertrain with PSD
Acknowledgments
First and foremost, I would like to thank my advisor, Prof. Yunsi Fei, for her support of my research. Her excellent insights, guidance, and advice helped me strengthen my creativity, train my critical thinking, and stay on the right track throughout my PhD study. I have also learned many skills from her about paper writing and presentation. I would also like to thank my committee members, Prof. Waleed Meleis, Prof. Edmund Yeh, and Prof. Ningfang Mi, for their great advice on my research proposal and dissertation. I am also thankful to Prof. Chee-Wooi Ten from Michigan Tech University for his valuable advice on my research.
I would particularly like to thank my family for their endless love and support. They have always given me the confidence and courage to face difficulties on this long journey. I could not have gone through this without their support.
I thank my lab mates for their help with my research and in my daily life. Working with you has been enjoyable and will remain a good memory. I also want to thank all my friends, who bring fun to my life and take away my weariness.
Abstract of the Dissertation
Optimization and Management of Cyber-Physical Systems - Smart Grid
and Plug-in Hybrid Electric Vehicles
by
Bingnan Jiang
Doctor of Philosophy in Computer Engineering
Northeastern University, August 2015
Dr. Yunsi Fei, Adviser
In cyber-physical systems (CPS), the bi-directional link between computational and physical elements can significantly increase the efficiency, reliability, and cost-effectiveness of CPS. A precursor generation of CPS can be found in diverse applications, of which smart grid and plug-in hybrid electric vehicles (PHEVs) are two exemplary, vibrant ones. Compared with traditional power distribution systems and gasoline-fueled vehicles, smart grid and PHEVs have much lower cost, offer higher service provision, and make the environment greener. However, it is challenging to manage the operations of CPS optimally, in view of system complexity, interaction between cyber and physical components and the environment, limited computation resources, and high real-time performance requirements.
My dissertation focuses on optimization and prediction model design for cost-effective and energy-efficient CPS, namely smart grid and PHEVs. First, a novel cost-effective energy ecosystem is proposed for a residential microgrid with renewable energy resources. It effectively coordinates demand response (DR), distributed generations (DGs), and energy storage management through a three-level hierarchical optimization, in which a particle swarm optimization (PSO) algorithm and an environment-adaptive Q-learning algorithm are applied. Second, I explore the application of modern vehicle-to-grid (V2G) technologies to smart grid reactive power compensation. On-board chargers of plug-in electric vehicles (PEVs) are proposed to be utilized as mobile volt-ampere reactive (VAR) resources. Third, an on-road PHEV power management system is proposed which utilizes the information of stochastic vehicle driving states and real-time traffic conditions. With these stochastic elements incorporated, a two-level hierarchical optimization model is developed based on multi-stage stochastic quadratic programming (MSQP) and Markov decision process (MDP). The proposed system makes optimal on-road power management decisions, and simulation results demonstrate that its performance is superior to existing methods in terms of fuel saving. Finally, a novel vehicle speed prediction algorithm is proposed in the context of vehicular networks. Vehicle speed prediction serves as an important input to many vehicle applications, e.g., power management. A novel data-driven vehicle speed prediction framework is proposed that integrates neural network (NN) and hidden Markov models (HMMs). Prediction accuracy is improved in the proposed method compared with existing ones.
Chapter 1
Introduction
As more electricity-consuming products come into daily life, e.g., electric vehicles (EVs)
and advanced HVAC systems, load demand is increasing dramatically and imposing new challenges
on the existing power grid. Smart grid, integrated with renewable energy generation, advanced metering
infrastructure, and information technologies, can cope with the impending global energy crisis and
environmental deterioration. With great technological advances, the rapidly developing plug-in electric
vehicles (PEVs) and plug-in hybrid electric vehicles (PHEVs) are taking the place of traditional
gasoline-fueled vehicles for both cost and emission reduction. Both smart grid and PHEVs are exemplary
cyber-physical systems (CPS), where the close interaction between cyber and physical elements can
significantly improve system efficiency, reliability, and cost. However, managing and optimizing
these CPS so as to take full advantage of them is challenging, due to system complexity,
dynamics, environment uncertainty, limited resources, and high real-time performance requirements.
Smart grid features renewable energy and distributed generation (DG), from which
cheaper and cleaner energy is supplied to users. However, the expensive infrastructure investment,
such as the cost of wind turbines and high-capacity batteries, is one of the major obstacles preventing
their popularization in ordinary households. One solution is to build a microgrid where energy
facilities are shared by the whole community, significantly reducing the infrastructure cost for each
household. In a microgrid, the load demand can be scheduled by demand response (DR) to increase
energy utilization efficiency [1]. With the price profile known, some load is shifted to off-peak hours
to reduce the energy consumption cost. Besides, distributed energy resources (DER), including
DG and energy storage system, should be optimally managed to minimize the energy generation
cost. Most existing works optimize DR and DER management separately, since it is challenging
to integrate them in optimization models on account of system complexities, i.e., different roles,
decisions, and a large number of control variables. Extra cost reduction can be achieved if DR and
DER management are well coordinated. Another difficulty is to integrate stochastic elements, e.g.,
stochastic wind power and load demand, into optimization models, since their accurate mathematical
models are usually hard to build. To coordinate DR and DER efficiently in the energy ecosystem, a
new hierarchical optimization model in a multi-agent system is designed in this dissertation.
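The load-shifting idea behind DR can be illustrated with a minimal sketch, assuming a single schedulable task of fixed duration and constant power and an hourly price profile known in advance. The function name `cheapest_start`, the price values, and the allowed time window are hypothetical and are not taken from the dissertation's DR model.

```python
def cheapest_start(prices, duration, earliest, latest_end):
    """Return (start_hour, summed_price) of the cheapest feasible start.

    prices     : hourly electricity prices, one entry per hour (assumed known)
    duration   : task length in whole hours
    earliest   : first hour at which the task may start
    latest_end : hour by which the task must have finished (exclusive)
    """
    best = None
    for start in range(earliest, latest_end - duration + 1):
        # Cost per kW of constant load if the task runs in [start, start + duration)
        cost = sum(prices[start:start + duration])
        if best is None or cost < best[1]:
            best = (start, cost)
    if best is None:
        raise ValueError("window is too short for the task")
    return best


# Hypothetical day-ahead prices ($/kWh): expensive from 17:00 to 21:00
prices = [0.08] * 7 + [0.12] * 10 + [0.20] * 4 + [0.10] * 3
start, cost = cheapest_start(prices, duration=2, earliest=17, latest_end=24)
```

With these illustrative prices, the two-hour task moves out of the evening peak and starts at hour 21. The sketch considers cost only; the DR design in this dissertation also takes user satisfaction into account.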
In addition to real power management, reactive power compensation is also a major concern
for energy-efficient and reliable smart grid, especially with the increasing load demand and DG
penetration. PEVs can be plugged into dedicated sockets for charging [2] and provide auxiliary
vehicle-to-grid (V2G) support simultaneously through their on-board bidirectional chargers [3, 4].
Reactive power compensation targets power loss reduction, voltage regulation, power factor
correction, etc. [5, 6]. Reactive power compensation is traditionally provided by distributed
generations (DGs) [5] and by static volt-ampere reactive (VAR) compensators [7]. However, these
methods are limited by the fixed capacities and locations of reactive power resources. Recent research
results show that reactive power compensation from PEV on-board chargers does not affect their
batteries’ lifetime [8]. In this dissertation, PEV on-board chargers are utilized as mobile VAR
resources to enhance the reactive power compensation for smart grid.
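How much reactive power a charger could supply while it is charging can be estimated from the standard apparent-power relation S^2 = P^2 + Q^2. The sketch below assumes the charger is limited only by its rated apparent power; the function names and the 5 kVA rating are hypothetical, and the charger model used later in the dissertation may impose further limits.

```python
import math


def available_reactive_power(s_rated_kva, p_charge_kw):
    """Largest |Q| (kvar) that fits within the rating while charging at P (kW)."""
    if abs(p_charge_kw) > s_rated_kva:
        raise ValueError("active power exceeds the charger rating")
    return math.sqrt(s_rated_kva ** 2 - p_charge_kw ** 2)


def power_factor(p_kw, q_kvar):
    """Power factor |P| / S for the given active and reactive power."""
    s = math.hypot(p_kw, q_kvar)
    return abs(p_kw) / s if s > 0 else 1.0


# Hypothetical 5 kVA on-board charger drawing 3 kW for battery charging
q_max = available_reactive_power(5.0, 3.0)
pf = power_factor(3.0, q_max)
```

Under this assumption, a 5 kVA charger drawing 3 kW could still exchange up to 4 kvar with the grid, at a power factor of 0.6 at the charger terminals.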
The PHEV power management system is another CPS studied in my dissertation. A PHEV’s
powertrain is usually designed in a series mode, parallel mode, or series-parallel mode with power-
split devices (PSDs). In series PHEVs, torques from internal combustion engines (ICEs) are applied
to generators to generate electricity which is then supplied to electric motors (EMs) to generate
traction torques and drive vehicles. For parallel PHEVs, traction torques are generated by both ICEs
and EMs for long-distance driving. Many modern PHEVs, such as Toyota Prius PHEV, are designed
with PSDs to further increase energy efficiency. A PSD introduces an extra degree of control freedom for the
powertrain, i.e., ICE speed, so power decisions can be made more flexibly and optimally according
to specific driving states. With different operation costs and efficiency characteristics, ICE and EM
are usually controlled together to achieve minimum fuel, electricity, or hybrid energy consumption.
Existing PHEV power management methods can be categorized into offline and online. Offline
management is usually formulated as an optimization problem based on historical driving cycles with
the assumption that future driving routes are known. This simplifies problem formulation, but a
large difference between the assumed future driving cycles and the real ones can significantly degrade
the management performance. In contrast, online management makes power generation decisions
in real time, adapting to instantaneous driving states. Constrained by the limited onboard
computation resources, online management algorithms are usually designed with low complexity,
like power balancing strategies, without utilizing trip information. Thus, online decisions are usually
not optimal for the entire trip. To take advantage of both online and offline management, a hybrid
power management system is designed to improve the system performance. It utilizes not only
historical driving cycles, but also real-time driving states, trip information, and traffic conditions.
Vehicle speed prediction serves as an important input for many vehicle-specific applications
such as PHEV power management. Accurate vehicle speed prediction is challenging and needs to
incorporate many internal and external elements into prediction models, such as vehicle type,
road type, and driving conditions. Traditionally, traffic and vehicle speed data are collected by
loop detectors and dedicated on-board equipment. Such data collection equipment can hardly be
deployed densely in a road network for vehicle speed prediction due to its high cost. In the context
of vehicular networks, more traffic and driving data can be obtained from additional sources and
easily shared between vehicles and remote data centers. Thus, facilitated by the new infrastructures
and enriched data, new data-driven algorithms can be designed to improve the accuracy of vehicle
speed prediction.
1.1 Contributions
This dissertation focuses on the optimal management and prediction system design for
cost-effective, energy-efficient, and reliable smart grid and PHEV CPS. The main contributions of
this dissertation are as follows:
• A novel cost-effective energy ecosystem for a smart microgrid is proposed with a three-level
hierarchical optimization. The hierarchical optimization coordinates DR and DER management
and reduces the computational complexity. Interaction between users and DR is enhanced
by adopting users’ feedback, so that DR agents make decisions that adapt to changes in users’
preferences. Instead of optimizing each individual µCHP’s generation, all µCHPs are optimized
cooperatively for the whole community in a shared cost-led mode. An environment-adaptive
battery discharging management algorithm is designed based on Q-learning. It considers
stochastic elements in the microgrid and gives an optimal discharging policy.
• A V2G system is proposed to compensate reactive power for the grid during PEVs’ parking
and charging. The novelty of the proposed system lies in the utilization of PEV on-board
chargers as flexible and distributed VAR resources. PEV charging and parking are scheduled to
maximize the benefits of both PEV owners and the utility grid. The scheduling is then formulated as
a multi-objective mixed integer nonlinear programming problem. The Normalized Normal Constraint
(NNC) method [9] is used to transform the multi-objective optimization to a set of single-
objective optimizations, each of which is solved to obtain a Pareto optimal solution (Pareto
point). Since the transformed single-objective optimization problems are nonlinear, they
are linearized into mixed integer linear programming (MILP) problems for efficient solving.
To make the solution approach scalable as the number of PEVs increases, a decentralized
algorithm is designed based on Lagrangian relaxation and decomposition.
• An on-road PHEV power management cyber-physical system is proposed in the context of
vehicular networks. The objective of the proposed system is to minimize the fuel consumption
of a PHEV in a trip. The main contribution of this work is the design of a novel two-level
stochastic hierarchical power management system which utilizes vehicle real-time driving data,
vehicle speed prediction information, and historical driving cycles. The power management
consists of two steps, a high-level online battery budget allocation and a low-level offline
power policy generation. Decisions from the two-level optimizations are combined for PHEV
power management, which is optimal for individual driving trips.
• A vehicle speed prediction algorithm is proposed in vehicular networks. The objective is to
accurately predict individual vehicle speeds considering the effects of traffic conditions,
road types, and driving behaviors. Its main contribution is to build a statistical model that
captures the relationship between traffic conditions and vehicle speeds for on-road vehicle
speed prediction. The novelty of the prediction model lies in the consideration of unobservable
driving states. The first-level neural network (NN) models predict traffic speed of road segments
according to historical traffic data. The traffic speed prediction result will serve as input to
the second-level model. In the second level, the statistical relationship between the individual
vehicle speed and traffic speed is modeled by hidden Markov models (HMMs), which are
trained offline with historical traffic and vehicle speed data. The traffic speed prediction result
is then plugged into the HMMs to achieve vehicle speed prediction on each road segment along
the driving route.
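The notion of a Pareto optimal point used in the V2G contribution above can be illustrated with a small sketch. It assumes every objective is to be minimized and simply filters a finite set of candidate points; it is not the NNC method itself, and the function name `pareto_front` and the sample points are hypothetical.

```python
def pareto_front(points):
    """Return the points that no other point dominates (all objectives minimized).

    A point q dominates p if q is no worse than p in every objective
    and strictly better in at least one.
    """
    front = []
    for p in points:
        dominated = any(
            all(qi <= pi for qi, pi in zip(q, p))
            and any(qi < pi for qi, pi in zip(q, p))
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical (objective 1, objective 2) values of five candidate schedules
candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
front = pareto_front(candidates)
```

Here (3, 4) is dominated by (2, 3) and (5, 5) by (1, 5), so three trade-off points remain. A method such as NNC aims to generate points of this kind directly by solving a sequence of single-objective problems rather than by filtering candidates.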
1.2 Thesis Organization
The rest of the dissertation is organized as follows. Chapter 2 describes the design of a
smart microgrid energy ecosystem. Functions and relations of system roles are first described in
the overview. The modeling and implementation of DR and DER management, as well as the bill
balancing algorithm, are then presented in detail. Finally, power management simulation results
are shown in figures and the performance improvement is analyzed. In Chapter 3, we consider the
design of a V2G reactive power compensation system based on optimal decentralized scheduling. The
system scheme and charger’s model are first described. A multi-objective optimization formulation for
PEV scheduling is then presented. Algorithms are also introduced for solving Pareto points, problem
linearization, and decentralized problem solving. Simulation is carried out based on a commercial
area in Boston and different case configurations are considered. From the results, benefits of both
PEV owners and utility grid are analyzed. Chapter 4 focuses on the design and implementation of
our proposed on-road PHEV power management system. The PHEV powertrain model and power
management objectives are first discussed. The design of two-level power management is shown
with the scheme diagram, formulation, and algorithm selection. The proposed power management
system is then tested on a Toyota Prius in simulation. Simulation results are analyzed and the fuel
consumption of the proposed system is compared with that of other existing systems. Our traffic and vehicle
speed prediction work is presented in Chapter 5. The scheme of two-level prediction system in
vehicular networks is described with a diagram. Details of each level prediction design are further
discussed. Simulation results are presented and the prediction accuracy is compared with other
methods. Chapter 6 concludes the whole dissertation.
Chapter 2
Optimal Energy Management in smart
microgrid
2.1 Background and Motivation
Recent works have focused on the design and analysis of DR and DG management systems
for smart homes and smart grids. Work [10] discusses challenges relating to load forecasting and DR.
Work [11] shows that the unpredictable human factors can influence DR system’s performance
significantly. Existing DR systems are designed with deterministic or stochastic algorithms in
centralized or decentralized ways. A stochastic dynamic programming method for electricity usage
is proposed in [12]. It assumes that the transition probabilities of system states, e.g., utility power price
and outdoor temperature, are known. Work [13] proposes a residential DR algorithm
using Q-learning, which takes stochastic load demand, electricity price, and user’s convenience
into consideration. However, Q-learning can hardly be applied to complex tasks and price models
because convergence is slow when the state and action spaces are large. In work
[14], a decentralized load management control system is proposed based on real-time prices. Overall,
most existing methods neglect the interaction between the DR system and the user, i.e., improving
the accuracy of decisions by observing the user’s manual adjustments. Therefore, these DR systems cannot
accommodate users’ preference changes and may give unsatisfactory decisions.
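The slow convergence of Q-learning noted above follows from how the tabular method works: every state-action pair has its own entry, and each entry must be visited repeatedly. The sketch below shows the standard tabular Q-learning update together with the table size for a hypothetical discretization. The state names, actions, and level counts are illustrative assumptions, not the model of [13].

```python
from collections import defaultdict
from math import prod


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update and return the new value."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def table_size(levels_per_state_dim, n_actions):
    """Number of Q-table entries: product of state levels times action count."""
    return prod(levels_per_state_dim) * n_actions


q = defaultdict(float)       # unseen state-action pairs start at 0.0
actions = ("run", "defer")   # hypothetical appliance actions
new_value = q_update(q, "peak", "defer", reward=1.0,
                     next_state="off_peak", actions=actions)

# Hypothetical discretization: 24 hours x 10 price levels x 10 demand levels
entries = table_size([24, 10, 10], n_actions=len(actions))
```

Even this coarse discretization yields 4800 entries for a single task with two actions, and each added state dimension or action multiplies that count.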
For DG management in microgrids, different optimization methods have been implemented.
Some work is based on static load and weather forecasts [15, 16], which does not consider their
dynamic and stochastic characteristics in real situations. Work [17] proposes a scheduling method for
hybrid supplies considering stochastic elements. The obtained decisions are optimal for the average
performance of all possible situations, but cannot adapt to an actual instance. In work [18], a robust
energy management for microgrid with intermittent renewable energy resources is proposed and the
worst-case transaction cost is included in the cost function. As a new type of clean DG with high
energy efficiency and low emission, micro combined heat and power systems (µCHPs) have recently
attracted much attention and become a promising DG in residential homes. µCHP control strategies
can be categorized as heat-led, electricity-led, and cost-led [19, 20, 21]. In a heat-led or electricity-led
strategy, the µCHP generates energy whenever there is a heat or electricity demand, respectively.
The cost-led strategies proposed in [20, 21] utilize the characteristic of co-generation to achieve
the minimum overall cost. Under this strategy, extra electricity will be exported to the utility grid
and additional heat will be absorbed by thermal energy storage, such as a hot water tank. Existing
strategies focus only on the optimal operation of a single µCHP. The issue of coordination among
multiple µCHPs for cost reduction in smart microgrid has not been addressed and will be explored in
our proposed energy ecosystem.
2.2 System Overview
The ecosystem is described in Fig. 2.1 with six interacting components: utility grid, renewable
energy, DG, storage, appliances, and users. The behavior of one component affects that of the others.
DG generates energy locally by harvesting renewable energy resources or using fuel provided by
a utility company. The utility grid, DG, and energy storage provide energy to appliances, which in turn
provide services to users. Extra generated energy is stored in battery and thermal energy storage for future
use. The operation time of appliances is scheduled in DR for cost effectiveness. Finally, users enjoy
services and pay their bills. The scheme of the microgrid studied in this dissertation is shown in
Fig. 2.2. It works in the grid-connected mode and includes three types of flows: electric power flow,
thermal power flow, and information flow. The electric power demand is supplied by the utility grid,
centralized wind turbines and batteries, and distributed µCHPs. The thermal power demand in each
individual house is supplied by a µCHP or an electric heat pump. Wind turbines are shared by the whole
community. Each household has a subscription rate indicating the amount of wind and battery
power that it can use. µCHPs generate power according to the load demand in the microgrid. In the
grid-connected microgrid, power generation and consumption are balanced. When load demand is
higher than generation, the shortfall is supplied by the utility grid. If extra electricity is generated, it
is sold back to the utility grid. Current policies usually allow wind energy to be sold at retail
rates, while µCHP energy is sold at lower avoided-cost rates [22]. For general consideration, some houses
in the community are equipped with µCHPs, whereas others are not. For the latter, thermal energy
can only be provided by an electric heat pump. Extra thermal energy generated by a µCHP can be either
stored in the hot water tank or dumped. The temperature of the water tank should also be maintained
within a range over time. Batteries belong to the community. They discharge in peak hours to reduce
cost. They also work as standby power supplies for emergency blackouts. We select the vanadium
redox battery (VRB) rather than conventional deep-cycle lead-acid batteries because the VRB has a
much longer life cycle, higher efficiency, and lower discharging cost [23, 24, 25]. The information
flow contains utility power price, wind power prediction, users’ input, system status, control signals
from agents, etc.

Figure 2.1: Energy ecosystem in smart microgrid
The three-level hierarchical optimization is depicted in Fig. 2.3. Load demand and power
supply in the system are decoupled in this hierarchy. DGs can be categorized into two types:
uncontrollable ones determined by the external environment, like wind turbines, and controllable ones
like µCHPs. The hierarchical optimization first realizes DR based on the wind power generation
and then manages µCHPs’ generation and VRB discharging. The time resolutions at different
optimization levels are set the same for easy synchronization.
Figure 2.2: Scheme of the microgrid for a community (components: centralized agent, distributed agents (DA), wind turbine, VRB, inverter, smart meter, circuit breaker, micro CHP, water tank, heater; flows: electric power, thermal power, information)
The lowest level is DR executed by the distributed agent (DA) of each house for energy
consumption cost reduction with users’ satisfaction taken into consideration. Distributed DRs can
significantly reduce computational complexity without losing much optimality. In each house, DA
collects relevant external information, e.g., day-ahead time-varying utility price and wind power
prediction, and realizes dynamic DR at every decision time, i.e., every hour. DR results include
the starting time of each schedulable task. The optimization formulation in DR considers power
supply from utility grid and wind turbine (subscribed wind power of each household), but not
µCHPs or VRB. This decoupling is reasonable since the cost and ability of µCHP generation do not
vary along the time. The VRB discharging capability does not change much unless it is depleted.
Thus, DR optimization results are not affected much when power supply from µCHPs and VRB
are not considered. DR results are updated dynamically in each decision period with new external
information or new added load.
At the second level, a centralized agent gets load demand of each house from DR decisions
and optimizes µCHPs’ generation. At one time, some houses may have high electric load demand
that cannot be supplied merely by their subscribed wind power and µCHP self generation, while
others with low demand do not need µCHP generation. Thus, this dissertation considers the potential
improvement of energy generation efficiency by coordinating distributed µCHPs during optimization
and proposes the shared cost-led µCHP management strategy.
Figure 2.3: Microgrid management based on hierarchical optimization and bill balance (dynamic DR by the distributed agent, followed by CHP management, VRB management, and bill balance by the centralized agent; deterministic or stochastic inputs: load demand, wind forecast, utility price; bill balance inputs: cost of each house, subscription rate)
In this strategy, instead of generating
power for its own house, generation of all µCHPs is coordinated to minimize the cost of the whole
community. The optimization agent first calculates the remaining load demand in the community
after deducting the predicted wind power supply. Since µCHPs generate electric and thermal power
simultaneously and have higher power output than battery, generation of µCHPs is first optimized
at this level to supply the remaining load. The optimization considers DR decisions from the first
level as well as utility power price and wind power prediction. VRB discharging is not considered in
optimization at this level.
The last level optimization is for VRB charging and discharging. Different from µCHPs
power generation, VRB can respond fast to load changes with charging/discharging. So its dis-
charging is optimized to compensate stochastic load and insufficient µCHP power generation in
the microgrid at the final stage. VRB is charged by the extra wind power if it is not full and the
power selling price is low. Obtaining an optimal discharging policy is challenging because it depends on
the stochastic environment, i.e., load demand, utility price, and future available wind power. One
simple strategy is to discharge VRB whenever extra power (in addition to wind power and µCHP
power) is needed. It is not optimal because it does not consider the varying electricity price. Another
policy is to discharge VRB only in periods when utility electricity price is high. However, this
keeps the VRB fully charged most of the time, so the surplus wind power
cannot be stored. Mathematical modeling of an environment model for the microgrid is complex
and impractical. Therefore, we propose a reinforcement learning-based VRB discharging strategy
by evaluating decisions’ immediate and subsequent effects on the ecosystem. The centralized agent
gets load demand and µCHPs and wind generation information, and takes into account their possible
stochastic changes during policy making. With reinforcement learning, the discharging policy can be
obtained from the interaction between the agent and the environment without establishing its detailed
models.
With the three-level hierarchical optimization, the energy consumption cost for the whole
community is minimized. To guarantee fairness for all households, their utility bills need to be
balanced according to their energy consumption and generation at different time. For example, a
house with low subscription rate may have high load demand which consumes supplies from others’
subscribed wind power or µCHP generation. It is unfair for the latter to pay more gas fees or provide
their own wind power for the former without bill balance.
2.3 Design of Distributed DR Systems
2.3.1 DR Model
DR gives load scheduling decisions which are updated dynamically at different time
according to the change of load demand and wind power prediction. Electric load demands are
either schedulable or fixed energy consuming tasks. A schedulable task can be assigned to operate at
different time with different user’s satisfaction. For example, the working of laundry machine and EV
charging are schedulable tasks. It is also assumed that a scheduled task cannot be interrupted once it
starts. On the contrary, a fixed task is time-sensitive and must be executed at designated time, such
as the operation of refrigerator, watching TV programs at specific time, and turning on heating and
air conditioning by house residents. DR is designed for schedulable tasks and will give the optimal
scheduling solution at each decision time. A DR decision time point can be in three situations: at the
beginning of each hour, when a user adds new tasks, or when a user intends to adjust the scheduling
decisions.
Figure 2.4: Model of Dynamic DR (inputs to the distributed agent: dynamic task array, external information (utility power price, wind power prediction), and task preference rate functions; output: scheduling decision, which updates the task array)
The DR model diagram is shown in Fig. 2.4. It consists of dynamic task array, external
information, and task preference rate functions as input. The output is DR scheduling. The scheduling
results will be updated according to new available information at each decision time. Dynamic task
array consists of six types of tasks as listed below. Only the first three types are schedulable in DR.
• New requested tasks: new tasks requested by the user.
• Pending tasks: scheduled tasks which do not yet start.
• Interfered tasks: tasks whose scheduling time is adjusted by the user.
• Started tasks: scheduled tasks that have already started.
• Fixed tasks: tasks strictly required to be executed at certain time.
• Dropped tasks: tasks dropped by the agent considering the maximum power constraint.
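The six task types can be represented as a small data structure. The following is a minimal sketch in Python, not the dissertation's implementation; the class names, field names, and example tasks are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class TaskType(Enum):
    NEW_REQUESTED = "new requested"
    PENDING = "pending"
    INTERFERED = "interfered"
    STARTED = "started"
    FIXED = "fixed"
    DROPPED = "dropped"


# Only the first three types are schedulable in DR.
SCHEDULABLE_TYPES = {TaskType.NEW_REQUESTED, TaskType.PENDING, TaskType.INTERFERED}


@dataclass
class Task:
    name: str
    task_type: TaskType
    rated_power_kw: float  # hypothetical field for the rated power of the task
    duration_slots: int    # hypothetical field for the number of required time slots


def schedulable(tasks):
    """Return the tasks that DR may schedule at the current decision time."""
    return [t for t in tasks if t.task_type in SCHEDULABLE_TYPES]


# Illustrative task array (values are made up).
example_tasks = [
    Task("laundry", TaskType.NEW_REQUESTED, 0.5, 1),
    Task("EV charging", TaskType.PENDING, 3.3, 4),
    Task("refrigerator", TaskType.FIXED, 0.15, 24),
]
```

Calling `schedulable(example_tasks)` keeps the laundry and EV charging tasks and leaves the fixed refrigerator load out of the scheduling problem.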
The external information includes the day-ahead time-varying utility electricity price and
hourly updated wind power forecast. Each task is associated with a preference function, which is
designed to indicate a user’s varying satisfaction dependent on the task’s starting time. Preference
functions are updated dynamically according to users’ preference change due to summer/winter
time switch, weather change, holiday seasons, short term change of living habit caused by irregular
working agenda, etc.
At each DR decision time $k$, the preference function $F^{k}_{\mathrm{pr},i}(t)$ for task $i$ is a function of the task
starting time $t$. The preference function is based on $f^{k}_{i}(t)$, which is the estimated probability density
function (pdf) of task $i$'s starting time $t$. As a non-parametric density estimation method, the kernel
density estimation (KDE) method has broad applications in the univariate case [26] and is suitable
for estimating $f^{k}_{i}(t)$. Initially, $f^{0}_{i}(t)$ is estimated from the historical task execution record as:
$$f^{0}_{i}(t) = \frac{1}{Nh} \sum_{n=1}^{N} K\left(\frac{t - T_n}{h}\right) \tag{2.1}$$
where $K$ is a symmetric probability density function, e.g., the Gaussian density function, called the kernel
function. $T_n$ is the $n$th sample in the data set. $N$ is the total number of samples in the data set. $h$
is the smoothing parameter called bandwidth, which determines the trade-off between estimation
bias and variance. Since the performances of different kernel functions are very similar, the Gaussian
kernel is selected for its convenient mathematical properties. $h$ is selected to minimize the mean
integrated square error (MISE) defined as:
$$\mathrm{MISE}(\hat{f}) = E \int \left[\hat{f}(t) - f(t)\right]^{2} dt \tag{2.2}$$
where $\hat{f}$ is the estimate and $f$ is the true density. For the Gaussian kernel, the rule-of-thumb optimal bandwidth is $h^{*} = 1.06\,\sigma N^{-1/5}$, where $\sigma$ is the sample standard
deviation. $f^{k}_{i}(t)$ at time point $k$ is updated by processing new samples, i.e., tasks’ actual starting
times, either in a regular way or with a weighted update. The main idea is to weight the user’s adjustments
so that changes in the user’s preference are learned faster. If a task is scheduled by DR and accepted by the user, the
sample is processed with an ordinary update. If the scheduling is not accepted and the task is rescheduled by
the user, the sample is processed with a weight $M$. The exact value of $M$ is determined by a tunable parameter
$\rho$ ($\rho > 0$) as $M = \max(2, \lfloor \rho N \rfloor)$, where $N$ is the size of the data set. The data set has a fixed capacity.
When the data set is full and new samples arrive, the oldest ones are replaced. $F^{k}_{\mathrm{pr},i}(t)$ is set
equal to the normalized pdf, $F^{k}_{\mathrm{pr},i}(t) = f^{k}_{i}(t)/\max_{t}\{f^{k}_{i}(t)\}$.
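The estimation steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the author's code: the function names are hypothetical, the maximum of the pdf is taken over a finite evaluation grid, and the weighted update is interpreted as inserting the adjusted sample $M$ times, which the text does not state explicitly.

```python
import math
import statistics


def gaussian_kernel(x):
    """Standard Gaussian density, used as the kernel K."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def rule_of_thumb_bandwidth(samples):
    """h* = 1.06 * sigma * N^(-1/5), with sigma the sample standard deviation."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)


def kde(samples, h):
    """Return f(t) = 1/(N h) * sum_n K((t - T_n) / h) as a callable, as in (2.1)."""
    n = len(samples)

    def f(t):
        return sum(gaussian_kernel((t - s) / h) for s in samples) / (n * h)

    return f


def preference_function(samples, grid):
    """Estimated pdf divided by its maximum over the evaluation grid."""
    f = kde(samples, rule_of_thumb_bandwidth(samples))
    peak = max(f(t) for t in grid)
    return lambda t: f(t) / peak


def add_sample(samples, t, capacity, accepted, rho=0.1):
    """Add a start time; assumption: a user-adjusted sample is repeated M times."""
    m = 1 if accepted else max(2, math.floor(rho * len(samples)))
    updated = samples + [t] * m
    return updated[-capacity:]  # the oldest samples are dropped when over capacity
```

For example, with historical start times clustered around 19:30, `preference_function(samples, grid)` returns values close to 1 near that hour and close to 0 in the morning. Because the maximum is taken over the grid, values slightly above 1 are possible between grid points.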
2.3.2 Optimization problem formation
The length of a DR cycle is set at 24 hours for scheduling tasks. Because users desire more
satisfaction with lower cost, the optimization at each decision time is to minimize the unit cost, the
energy consumption cost per satisfaction, for the current and remaining time in the cycle. Monetary
cost is calculated as the product of electricity price (utility price and wind power price), load power
demand, and load duration. After discretization with time resolution τ , the energy consumption cost
at DR decision time kn for all tasks (i = 1, ..., I) is formed as:
$$C(k_n) = \sum_{k=k_n}^{N_D} \left[ R_{\mathrm{W}}(k) E_{\mathrm{LW}}(k) + R_{\mathrm{G}}(k) E_{\mathrm{LG}}(k) \right] \tag{2.3}$$
where ELW(k) and ELG(k) are wind and utility grid energy used in time slot k, respectively. Con-
sidering both the schedulable tasks and fixed tasks, ELW(k) is calculated according to user’s wind
power subscription rate αs and its price is RW(k). The extra energy demand ELG(k) will be supplied
by the utility grid with price RG(k). Power consumption from wind PLW(k) and grid PLG(k) at time
k are formulated as:
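$$\begin{aligned}
P_{\mathrm{LW}}(k) &= \min\Big\{ \sum_{i \in I_{S,k_n}} P_i(k) + P_{\mathrm{F}}(k),\ \alpha_s P_{\mathrm{W}}(k) \Big\} \\
P_{\mathrm{LG}}(k) &= \max\Big\{ 0,\ \sum_{i \in I_{S,k_n}} P_i(k) + P_{\mathrm{F}}(k) - \alpha_s P_{\mathrm{W}}(k) \Big\}
\end{aligned} \tag{2.4}$$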
where Pi(k) is the power consumption of task i at time k. Each task i requires TR,i time slots for
operation. Pi(k) equals to the rated power PR,i if time k is within the task operating time. Otherwise,
it is 0. The total satisfaction one user can get at decision time kn by scheduling tasks is:
$$U(k_n) = \sum_{i \in I_{k_n}} U_i(k_n) = \sum_{i \in I_{k_n}} u_i\, s_i(k_n)\, F^{k_n}_{\mathrm{pr},i}\big(k_{\mathrm{sh},i}(k_n)\big) \tag{2.5}$$
Other variables and parameters include:
a) Control variables at decision time $k_n$: $s_i(k_n)$ is a binary value indicating the scheduling decision
for task $i$ at $k_n$: “1” means the task is scheduled and “0” means it is not. It determines the set of scheduled
tasks $I_{S,k_n}$ after decisions at time $k_n$ are made. $k_{\mathrm{sh},i}(k_n)$ is the scheduled starting time of task $i$.
b) Power parameters: $P_{\mathrm{W}}(k)$ and $P_{\mathrm{F}}(k)$ are the predicted total wind power supply (kW) and the total load
of fixed tasks (kW), respectively, at time $k$.
c) Time parameters: $N_D$ is the number of time slots in one DR cycle.
d) Other parameters: $R_{\mathrm{G}}(k)$ and $R_{\mathrm{W}}(k)$ are the utility electricity price (USD/kWh) and wind power
price (USD/kWh), respectively, at time $k$. $u_i$ is the weight coefficient reflecting the importance of
task $i$. $I_{k_n}$ is the set of tasks that have not started by the beginning of the DR decision at $k_n$.
DR optimization constraints include: first, all scheduled tasks should be completed before
the end of DR cycle. No tasks are allowed to be postponed to the next day. Second, one task may
depend on the completion of another one, for example, a dryer can start to work only after the laundry
machine finishes washing. Third, execution time of each task should be scheduled between current
decision time and the end of DR cycle. At last, each house has its maximum allowed power restricted
by the circuit breaker.
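The four constraints can be checked for a candidate schedule with a short routine. The sketch below is illustrative only; the function name, argument layout, and the encoding of dependencies as a single predecessor index per task are assumptions made for this example.

```python
def check_schedule(start, duration, power_kw, fixed_load_kw, depends_on,
                   k_now, n_slots, max_power_kw):
    """Return a list of violated DR constraints for a candidate schedule.

    start[i], duration[i], power_kw[i]: start slot, length in slots, rated power of task i.
    fixed_load_kw[k]: load of fixed tasks in slot k.
    depends_on[i]: index of a task that must finish before task i starts, or None.
    """
    violations = []
    load = list(fixed_load_kw)
    for i, s in enumerate(start):
        end = s + duration[i]  # first slot after task i has finished
        if s < k_now:
            violations.append(f"task {i} starts before the decision time")
        if end > n_slots:
            violations.append(f"task {i} does not finish within the DR cycle")
        j = depends_on[i]
        if j is not None and s < start[j] + duration[j]:
            violations.append(f"task {i} starts before task {j} finishes")
        for k in range(max(s, 0), min(end, n_slots)):
            load[k] += power_kw[i]
    for k, p in enumerate(load):
        if p > max_power_kw:
            violations.append(f"slot {k} exceeds the circuit-breaker limit")
    return violations
```

An empty list means the schedule is feasible. In a penalty-based solver, the number of violations could be turned into a penalty term, although the text does not specify the exact penalty form.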
Tasks are allowed to be dropped when some constraints cannot be satisfied, for example
when a user has tasks with high rated power that cause the total power to exceed the
maximum allowed value at some time. A task dropping penalty $P_T$ is thus introduced. The optimization
is to minimize the unit cost with the task drop penalty considered:
$$\min\ \frac{C(k_n)}{U(k_n)} + \Big[ |I_{k_n}| - \sum_{i \in I_{k_n}} s_i(k_n) \Big] P_T \quad \text{s.t. DR constraints} \tag{2.6}$$
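To make the objective concrete, the following Python sketch evaluates (2.3) to (2.6) for one candidate schedule. It is not the dissertation's implementation: the function and argument names are hypothetical, energy per slot is assumed to be power multiplied by the slot length $\tau$ in hours, and a schedule with zero satisfaction is assumed to have infinite unit cost.

```python
def dr_objective(scheduled, start, duration, power_kw, weight, preference,
                 fixed_load_kw, wind_kw, alpha_s, price_wind, price_grid,
                 k_n, n_slots, tau_h, drop_penalty):
    """Unit cost C(k_n)/U(k_n) plus the task drop penalty, following (2.3)-(2.6)."""
    n_tasks = len(start)
    cost = 0.0
    for k in range(k_n, n_slots):
        demand = fixed_load_kw[k] + sum(
            power_kw[i] for i in range(n_tasks)
            if scheduled[i] and start[i] <= k < start[i] + duration[i])
        share = alpha_s * wind_kw[k]          # subscribed wind power
        p_wind = min(demand, share)           # P_LW(k) in (2.4)
        p_grid = max(0.0, demand - share)     # P_LG(k) in (2.4)
        # Assumption: energy in a slot equals power times slot length tau_h.
        cost += tau_h * (price_wind[k] * p_wind + price_grid[k] * p_grid)
    satisfaction = sum(weight[i] * preference[i](start[i])
                       for i in range(n_tasks) if scheduled[i])
    dropped = n_tasks - sum(scheduled)
    if satisfaction <= 0.0:
        return float("inf")  # assumption: no satisfaction, no valid unit cost
    return cost / satisfaction + dropped * drop_penalty
```

A solver such as PSO would call this function for each candidate vector of start times and scheduling flags, adding a penalty for any violated DR constraint.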
2.4 Shared Cost-led µCHPs Management
Operations of distributed µCHPs are optimized according to the load scheduling results
from the DR system. The µCHP model described in [21, 17] is applied in this dissertation.
2.4.1 µCHP Model
A µCHP unit has three statuses: idle, start-up and generation. When the system is idle,
there is no fuel consumption and power generation. In the start-up period, fuel will be consumed
without power generation. After start-up, the system consumes fuels and generates both electric
and thermal power. The efficiency of µCHP unit, η, denotes the percentage of total useful power
generated from fuel input. Specifically, electric efficiency ηE and thermal efficiency ηT indicate the
proportion of generated electric and thermal power. The total energy generation of µCHP unit at
time t is [21, 17]:
$$P_C(t) = \eta\, S_C(t)\, g_F(t)\, q_F \tag{2.7}$$
where:
$S_C(t)$: binary status of µCHP, “0” for idle and start-up status and “1” for generation status.
$g_F(t)$: fuel stream input (Nft³/s).
$q_F$: heating value of fuel (kJ/Nft³).
Thus, the generated electric power is $P_{CE}(t) = \eta_E P_C(t)$ and the
thermal power is $P_{CT}(t) = \eta_T P_C(t)$.
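As a worked illustration of (2.7), the short Python function below returns the total, electric, and thermal power of one unit. The function name is hypothetical and the numbers in the usage note are made up for illustration; they are not parameters from the dissertation.

```python
def chp_output_kw(generating, fuel_flow, heating_value, eta, eta_e, eta_t):
    """Return (P_C, P_CE, P_CT) following (2.7).

    fuel_flow is g_F(t) in Nft3/s and heating_value is q_F in kJ/Nft3,
    so their product is in kJ/s, i.e., kW.
    """
    s_c = 1 if generating else 0  # S_C(t): 0 for idle or start-up, 1 for generation
    p_total = eta * s_c * fuel_flow * heating_value
    return p_total, eta_e * p_total, eta_t * p_total
```

For instance, with illustrative values `eta=0.8`, `eta_e=0.3`, `eta_t=0.7`, a fuel stream of 0.001 Nft³/s, and a heating value of 1000 kJ/Nft³, the unit delivers about 0.8 kW in total, split into about 0.24 kW electric and 0.56 kW thermal. During idle or start-up the output is zero.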
2.4.2 Shared Cost-led µCHPs Management Strategy
In view of the dynamic characteristic of the microgrid discussed above, including DR and
wind power forecast, it is important to ensure fast response of µCHPs to the load and supply changes.
Otherwise, more power will be consumed from utility grid and extra cost will be induced. When
a µCHP unit is on, the fuel cell power output can be controlled to respond to the input change
within 30 seconds [27]. Frequent status change not only wastes lots of fuel on start-up but also
slows down the response speed. Thus, one possible strategy is to keep all µCHP units in generation
status and adjust their power output according to load demand by controlling the fuel stream input
gF (t). When a µCHP is on, it generates at least a minimum amount of heat, which may exceed the requirement.
However, only a limited amount of extra generated heat can be stored in the water tank.
First, the water tank has an acceptable temperature range with a most desired value. Second, heat dumps
impose negative effects on the environment and are therefore usually restricted. In addition, this strategy is not
always cost effective. Therefore, it is important to determine the optimal “on/off” state of µCHPs
in advance according to available information. Since the amount of heat dump is limited, thermal
power generation should be constrained to keep desired water tank temperature.
The shared cost-led µCHPs management is formed as a two-level optimization problem.
The main idea is that the coarse-grained optimization has long-term perspectives and will guide the
fine-grained optimization in terms of µCHP state and thermal power generation along the time. The
fine-grained and coarse-grained optimization have time resolutions τ (slot, same as DR resolution)
and TC (period), respectively. TC is an integral multiple of τ . The CHP start-up time TS is also set
to an integral multiple of TC. The fine-grained optimization is to determine the detailed optimal
µCHP fuel input stream and electric heat pump generation for each slot τ in the current period TC
to minimize the energy consumption cost of the whole community. The coarse-grained one is to
minimize the sum of the approximate cost of the community in next NP coarse-grained periods
by determining the optimal µCHP states, the average fuel input volume, and the average electric
heat pump generation in each period. The solved optimal µCHPs’ states will be used for µCHPs’
state transitions. The total thermal generation in each period will serve as a constraint for the
fine-grained optimization in that period. For instance, if the status of one CHP unit is “off” in the current
coarse-grained period but it is preferred to generate power in the next period (assuming the start-up takes
one period), the system will prepare for start-up in the current period. This dissertation sets NP = 6 and
NS = 1. With this strategy, the predicted information is utilized, system responsiveness is guaranteed,
and an optimal solution for cost reduction is obtained.
2.4.2.1 Fine-grained Optimization for Current Period
There are $N_C = T_C/\tau$ time slots in one period. The total load demand in the community
$P_{\mathrm{SL}}(n)$ in time slot $n$ is calculated according to DR decisions as $P_{\mathrm{SL}}(n) = P_{\mathrm{F}}(n) + \sum_{i \in I_{S,n}} P_i(n)$.
The electric power consumption cost CCE,k(n) and fuel consumption cost CCF,k(n) of the whole
community in time slot n of current period k are formulated as:
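$$C_{\mathrm{CE},k}(n) = R_{\mathrm{G}}(n) E_{\mathrm{EG}}(n) + R_{\mathrm{W}} E_{\mathrm{EW}}(n) \tag{2.8}$$
$$C_{\mathrm{CF},k}(n) = R_{\mathrm{F}}\, \tau \sum_{m \in M_{\mathrm{G}}} g_{\mathrm{F},m,k}(n) \tag{2.9}$$
where $\tau \sum_{m \in M_{\mathrm{G}}} g_{\mathrm{F},m,k}(n)$ in (2.9) is the total µCHP fuel input volume within time $\tau$. The fuel consumption
cost $C_{\mathrm{CF},k}$ is obtained as the product of fuel volume and fuel price. Variables and parameters in (2.8)
and (2.9) include: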
a) Control variables: $g_{\mathrm{F},m,k}(n)$ is the fuel input stream of µCHP $m$ at time $n$ (Nft³/s).
b) Energy and power terms: $E_{\mathrm{EW}}(n)$ and $E_{\mathrm{EG}}(n)$ are the calculated energy consumption (kWh) from
wind power supply and utility grid, respectively, at time $n$ according to the scheduled load
demand $P_{\mathrm{SL}}(n)$ and µCHP electric power generation $P_{\mathrm{CHPE}}(n)$ in the microgrid.
$P_{\mathrm{CHPE}}(n) = \eta_E q_F \sum_{m \in M_{\mathrm{G}}} g_{\mathrm{F},m,k}(n)$, where $\eta_E$ is the electric efficiency of µCHPs and $q_F$ is the heating value of fuel
(kJ/Nft³).
c) Other parameters: $R_{\mathrm{F}}$ is the fuel gas price (USD/Nft³). $M_{\mathrm{G}}$ denotes the set of houses whose µCHPs
are in the “generation” state.
Extra generated power is sold to the utility grid with income BCM,k. The optimization problem at the
current decision time k is to minimize the total cost in the period as:
$$\min \sum_{n=1}^{N_C} \left[ C_{\mathrm{CE},k}(n) + C_{\mathrm{CF},k}(n) - B_{\mathrm{CM},k}(n) \right] \tag{2.10}$$
subject to the following constraints: First, the thermal generation of each house should be equal to
the value solved from coarse-grained optimization. Second, both fuel input and electric heat pump
generation have their allowable ranges. Finally, with electric heat pump added, the total electric
power consumption in a house cannot exceed the maximum value.
2.4.2.2 Coarse-grained optimization for Future Periods
The coarse-grained optimization has to consider NP periods. The cost function is the sum
of approximated cost of the whole community in these NP periods. Control variables are each CHP’s
state (binary values, 0 for idle state and 1 for generation state), its average fuel input volume (for the
units at generation state), and electric heat pump power consumption for heat generation. For each
time period, its approximated cost has the same formation as the fine-grained one, except that it has a
larger time resolution TC.
In addition to the constraints on thermal generation and maximum load, other constraints
include: first, in each period, the temperature of the water tanks should be maintained within a range;
second, at a designated time, the water tank temperature should reach the set point, i.e., the desired
average temperature; third, the maximum allowable heat dump is constrained.
2.5 VRB Discharging Management with Q-Learning
In this third-level optimization, VRB discharging is optimized to supply the
remaining load after consuming wind and µCHP power at the first two levels. This happens when
the load demand is high but the wind power is low or the µCHP generates insufficient power for
stochastic load demand. The efficiency of VRB is determined by its charging/discharging current
and the state of charge (SOC) with nonlinear characteristics. To keep high battery efficiency, the
charging/discharging current and SOC are constrained within certain ranges, in which the efficiency
can be approximated to be a constant value [25, 28].
The stochastic load demand and wind power can be modeled as Markov chains. VRB
management is formed as a Markov decision process (MDP) with the decision time resolution τ . At
decision time k, the system state space can be described as:
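$$X(k) = \left[ R_{\mathrm{G}}(k),\ P_{\mathrm{W}}(k),\ E_{\mathrm{INS}}(k),\ S_{\mathrm{DOD}}(k) \right] \tag{2.11}$$
where $E_{\mathrm{INS}}(k)$ is the state of remaining load demand energy calculated by applying wind power
$P_{\mathrm{W}}(k)$ and µCHP generation $P_{\mathrm{CHPE}}(k)$ to load demand $P_{\mathrm{SL}}(k)$:
$E_{\mathrm{INS}}(k) = \max\{0,\ \tau [P_{\mathrm{SL}}(k) - P_{\mathrm{W}}(k) - P_{\mathrm{CHPE}}(k)]\}$. $S_{\mathrm{DOD}}(k)$ is the depth of discharge (DOD) state of VRB. The action is to
discharge $\mu(k)$ percent of $E_{\mathrm{INS}}(k)$ from VRB at decision time $k$. Actions are constrained by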
minimum and maximum VRB discharging power, as well as VRB’s SOC. The reward function for
action $\mu(k)$ is designed considering both the cost saving from discharging and system stability with
battery backup energy:
$$r(k) = a_1(k) - \lambda a_2(k) \tag{2.12}$$
where
$$a_1(k) = \frac{\left[ R_{\mathrm{G}}(k) - R_{\mathrm{B}} \right] \mu(k) E_{\mathrm{INS}}(k)}{\left( R_{\mathrm{G,max}} - R_{\mathrm{B}} \right) E_{\mathrm{D,max}}} \tag{2.13}$$
$$a_2(k) = \frac{\Delta S_{\mathrm{DOD}}(k)}{1 - S_{\mathrm{DOD}}(k+1)} \tag{2.14}$$
a1(k) is the normalized cost reduction for the microgrid with VRB discharging. RB is the VRB
discharging cost (USD/kWh). RG,max is the maximum value of time-varying utility price. ED,max is
the maximum VRB discharging energy in a decision period. a2(k) is the normalized battery DOD
change weighted by battery SOC state at time k + 1. λ is a positive weight for a2. When a1(k) is
larger, which indicates more cost saving is achieved, the action is considered as cost effective and
denoted with larger reward. On the other hand, larger a2(k) means more energy is discharged from
VRB and therefore less energy is available as backup. In that case, the reward is reduced.
The MDP will find the optimal policy $h^{*}$ and action $u_k = h^{*}(X(k))$ to maximize the total
reward that discounts the future rewards with a factor $\gamma$:
$$R = \sum_{n=0}^{\infty} \gamma^{n} r(k+n) \tag{2.15}$$
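The reward design in (2.12) to (2.15) can be written out as a small Python sketch. The function names are hypothetical, and several points are assumptions made for this example: $\mu(k)$ is passed as a fraction in [0, 1] rather than a percentage, $\Delta S_{\mathrm{DOD}}(k)$ is taken as $S_{\mathrm{DOD}}(k+1) - S_{\mathrm{DOD}}(k)$, and the infinite sum in (2.15) is truncated to a finite list of rewards.

```python
def remaining_energy(tau_h, p_load, p_wind, p_chp):
    """E_INS(k) = max{0, tau [P_SL(k) - P_W(k) - P_CHPE(k)]}."""
    return max(0.0, tau_h * (p_load - p_wind - p_chp))


def reward(r_g, r_b, r_g_max, e_d_max, mu, e_ins, dod_now, dod_next, lam):
    """r(k) = a1(k) - lambda * a2(k), following (2.12)-(2.14).

    Assumptions: mu is a fraction in [0, 1]; dod_next is below 1;
    the DOD change is dod_next - dod_now.
    """
    a1 = (r_g - r_b) * mu * e_ins / ((r_g_max - r_b) * e_d_max)
    a2 = (dod_next - dod_now) / (1.0 - dod_next)
    return a1 - lam * a2


def discounted_return(rewards, gamma):
    """Finite-horizon truncation of R = sum_n gamma^n r(k + n) in (2.15)."""
    return sum(gamma ** n * r for n, r in enumerate(rewards))
```

Discharging at the peak price with a full request gives $a_1 = 1$, while the $a_2$ term grows quickly as the DOD approaches 1, which discourages draining the backup energy.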
2.6 Problem-solving Algorithms
The DR is formulated as a nonlinear integer programming problem. For the shared cost-led
µCHP management, the fine-grained optimization for the current period is a linear programming
problem and the coarse-grained optimization for the future periods is a mixed integer nonlinear
programming problem. These nonlinear programming problems are non-convex and finding their global optimal
solutions is NP-hard. Therefore, in DR and µCHP management, locally optimal solutions are obtained
by the Particle Swarm Optimization (PSO) algorithm in real time for practical operation [29]. Works
[30, 31, 32, 33] have evaluated PSO on different benchmarks and shown good solution qualities.
PSO has many advantages over other evolutionary algorithms, like Genetic Algorithm (GA) [34, 35].
First, PSO has more effective memory capacity and better diversity for optimal solution search.
Second, PSO has faster search speed which is important for highly dynamic systems, such as the DR
and shared cost-led µCHP management.
In PSO, a swarm $S$ is a set of particles $S = \{x_1, x_2, \ldots, x_N\}$, where $N$ is the number of particles
participating in the solution search. Each particle is a vector $x_i = \{x_{i,1}, x_{i,2}, \ldots, x_{i,M}\}^{T}$ indicating
its position in an $M$-dimensional space as a candidate solution to minimize the cost function $F(x_i)$. The dimension
of each particle depends on the number of control variables. Each particle also has a velocity
$v_i = \{v_{i,1}, v_{i,2}, \ldots, v_{i,M}\}^{T}$, the shift of its position in each iteration. The swarm of particles
updates its velocities and positions, in each iteration $k$, towards the target solution (with minimum
cost) by utilizing both the individual best position $p_i(k) = \{p_{i,1}(k), p_{i,2}(k), \ldots, p_{i,M}(k)\}^{T}$ and the global
historical best position $p_g(k) = \arg\min_i F(p_i(k))$. The update is realized according to the following
equations:
$$\begin{aligned}
v_{i,j}(k+1) &= \omega v_{i,j}(k) + c_1 r_1 \left[ p_{i,j}(k) - x_{i,j}(k) \right] + c_2 r_2 \left[ p_{g,j}(k) - x_{i,j}(k) \right] \\
x_{i,j}(k+1) &= x_{i,j}(k) + v_{i,j}(k+1)
\end{aligned} \tag{2.16}$$
where $r_1$ and $r_2$ are random variables with uniform distribution in [0, 1]. $c_1$ and $c_2$ are acceleration
constants. $\omega$ is the inertia weight, a value decreasing with time. To prevent swarm divergence, the
velocity of the $j$th component $v_{i,j}(k+1)$ is clamped as $|v_{i,j}(k+1)| \le V_{\max,j} = (b_j - a_j)/2$, a common
selection, where $[a_j, b_j]$ is the feasible region of $x_{i,j}$. For the constraints of optimization, the method
of using a penalty function and preserving feasibility of solutions during initialization is adopted [32, 36].
For integer variables with a discrete search space, Discrete Particle Swarm Optimization (DPSO)
with rounding techniques proposed in [33] is used. For binary variables, a binary version of DPSO
with sigmoid function is used [37].
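A compact continuous-variable PSO following the update rule (2.16) with velocity clamping is sketched below in Python. It is an illustrative sketch, not the dissertation's Java implementation: the function name and default parameter values are assumptions, positions are additionally clipped to the feasible box (an assumption not stated in the text), and the discrete and binary variants are not covered. Constraints would enter through a penalty added inside `cost`.

```python
import random


def pso_minimize(cost, bounds, n_particles=20, n_iter=100,
                 c1=2.0, c2=2.0, w_start=0.9, w_end=0.4, seed=0):
    """Minimize cost(x) over box bounds [(a_j, b_j), ...] using update (2.16)."""
    rng = random.Random(seed)
    dim = len(bounds)
    v_max = [(b - a) / 2.0 for a, b in bounds]  # V_max,j = (b_j - a_j) / 2
    x = [[rng.uniform(a, b) for a, b in bounds] for _ in range(n_particles)]
    v = [[rng.uniform(-vm, vm) for vm in v_max] for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]
    p_cost = [float("inf")] * n_particles
    g_best, g_cost = x[0][:], float("inf")
    for it in range(n_iter):
        # Evaluate fitness and update individual and global best positions.
        for i in range(n_particles):
            c = cost(x[i])
            if c < p_cost[i]:
                p_best[i], p_cost[i] = x[i][:], c
            if c < g_cost:
                g_best, g_cost = x[i][:], c
        # Inertia weight decreases linearly with the iteration count.
        w = w_start - (w_start - w_end) * it / max(1, n_iter - 1)
        for i in range(n_particles):
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel = (w * v[i][j]
                       + c1 * r1 * (p_best[i][j] - x[i][j])
                       + c2 * r2 * (g_best[j] - x[i][j]))
                v[i][j] = max(-v_max[j], min(v_max[j], vel))  # velocity clamping
                pos = x[i][j] + v[i][j]
                # Assumption: positions are clipped to stay inside [a_j, b_j].
                x[i][j] = max(bounds[j][0], min(bounds[j][1], pos))
    return g_best, g_cost
```

As a usage example, `pso_minimize(lambda x: x[0] ** 2 + x[1] ** 2, [(-5.0, 5.0), (-5.0, 5.0)])` searches for the minimum of a two-dimensional quadratic. The fixed seed makes runs reproducible.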
As a model-free reinforcement learning technique, Q-learning [38] is used to obtain optimal
VRB discharging policy through interactions with the environment. At decision time k, Q-learning
observes system state X(k), takes an action µ(k), evaluates reward r(k), and updates the Q value
with a learning rate α ∈ [0, 1] and discount factor γ. The Q-learning algorithm with ε-exploitation
is shown in Algorithm 2.
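The tabular update described above can be sketched in Python as follows. This is a generic Q-learning loop with random action selection at probability ε, paired with a deliberately tiny toy battery environment. The environment, its rewards, and all names are hypothetical and only illustrate the mechanics; they are not the VRB model or state space of the dissertation.

```python
import random
from collections import defaultdict


def q_learning(step, actions, initial_state, n_steps,
               alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; step(state, action, rng) returns (next_state, reward)."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q(x, u), initialized to zero
    x = initial_state
    for _ in range(n_steps):
        if rng.random() < epsilon:
            u = rng.choice(actions)                    # random action with probability epsilon
        else:
            u = max(actions, key=lambda a: q[(x, a)])  # greedy action otherwise
        x_next, r = step(x, u, rng)
        best_next = max(q[(x_next, a)] for a in actions)
        q[(x, u)] += alpha * (r + gamma * best_next - q[(x, u)])
        x = x_next
    return q


def toy_battery_step(state, action, rng):
    """Toy environment: price alternates low/high; one unit of stored energy.

    state = (price_high, charged); action 1 discharges, action 0 holds.
    """
    price_high, charged = state
    r = 0.0
    if action == 1 and charged:
        r = 1.0 if price_high else 0.2  # discharging is worth more at the high price
        charged = 0
    if price_high:
        charged = 1                     # the battery is recharged after the high-price slot
    return (1 - price_high, charged), r


q_table = q_learning(toy_battery_step, [0, 1], (0, 1), n_steps=50000)
```

In this toy setting the learned values favor holding the charge in the low-price slot and discharging in the high-price slot, which mirrors the trade-off discussed for the two simple discharging strategies.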
2.7 Bill Balancing Algorithm
To ensure fair energy usage for all households in the ecosystem, their bills need to be
balanced according to their energy consumption and generation along the time. Wind and battery
energy are allocated to a household with its subscription rate. In bill balancing, it is assumed a
household utilizes all of its subscribed wind and battery energy, supplying its load and selling the
extra energy to the utility grid. The bill balancing for µCHP energy generation and consumption is
more complex. At one time, a household performs as either a contributor ($i \in M_C$) or a beneficiary
($i \in M_B$).
Algorithm 1 PSO algorithm
Initialize each particle’s position and velocity randomly;
For particles with infeasible position, randomly adjust their components until all initial positions are feasible;
for each iteration time do
    for each particle do
        Calculate fitness: fitness = cost function value + penalty;
        if New individual best position is found then
            Update individual best position;
        end if
    end for
    if New global best position is found then
        Update global best position;
    end if
    for each particle do
        Update velocity according to individual and global best position;
        Apply velocity clamping;
        Update position;
    end for
end for
Continue iteration if termination condition is not satisfied.
A contributor exports a part of its µCHP electric energy to the microgrid or reaches a
balance between generation and demand without export. On the contrary, a beneficiary consumes
electric energy from other µCHPs in the microgrid. Their relationship in the microgrid is shown in
Fig. 2.5. In each time slot, a household $i$ equipped with µCHP consumes fuel $F_{\mathrm{CHP},i}$ and generates
electric energy $E_{\mathrm{CHPE},i}$ and thermal energy $E_{\mathrm{CHPH},i}$. $E_{\mathrm{CHPE},i}$ first supplies its own electric load $E'_{\mathrm{L},i}$,
which is the remaining load of household $i$ after utilizing its subscribed wind and battery energy.
$E_{\mathrm{L,CHPE},i}$ is the part of electric energy supply from µCHPs. For a contributor $i$, $E_{\mathrm{CHPE},i} \ge E_{\mathrm{L,CHPE},i}$
and $E^{\mathrm{out}}_{\mathrm{CHPE,MG},i}$ is exported to the microgrid as required by other households. The remaining generation is
sold to the utility grid as $E^{\mathrm{out}}_{\mathrm{CHPE,UG},i}$. For a beneficiary $j$, $E^{\mathrm{in}}_{\mathrm{CHPE,MG},j}$ is imported from other µCHPs
to supply its demand, which is larger than $E_{\mathrm{L,CHPE},j} = E^{\mathrm{in}}_{\mathrm{CHPE,MG},j} + E_{\mathrm{CHPE},j}$. The total µCHP
electric energy import matches export inside the microgrid. With fairness consideration, $E^{\mathrm{in}}_{\mathrm{CHPE,MG},j}$
drawn from the microgrid should be proportional to the household’s load $E'_{\mathrm{L},j}$. The net benefit a
household obtains from µCHP generation in the microgrid is determined by three parts: cost saving
from self-generation, cost saving from importing energy from microgrid, and fuel consumption
cost. It is fair for a household to get the full benefit from self-generation. Cost saving from energy
exporting/importing is achieved by both contributors and beneficiaries. Contributors also consume
more fuels to generate energy for beneficiaries. Therefore, the last two parts need to be balanced
among households.
Algorithm 2 Q-learning algorithm with ε-exploitation
Q value initialization;
Initial state measurement;
for each step $k$, select action $u$ do
    $u_k \leftarrow$ random action selection with probability $\varepsilon_k$; otherwise $u_k \in \arg\max_{u'} Q_k(x_k, u')$;
    Taking action $u_k$, observe $x_{k+1}$ and $r_{k+1}$;
    $Q_{k+1}(x_k, u_k) \leftarrow Q_k(x_k, u_k) + \alpha_k \left[ r_{k+1} + \gamma \max_{u'} Q_k(x_{k+1}, u') - Q_k(x_k, u_k) \right]$;
end for
Figure 2.5: µCHP electric energy flow among contributors and beneficiaries in the microgrid (contributors $i$ export $E^{\mathrm{out}}_{\mathrm{CHPE,MG},i}$ to the microgrid and $E^{\mathrm{out}}_{\mathrm{CHPE,UG},i}$ to the utility grid; beneficiaries $j$ import $E^{\mathrm{in}}_{\mathrm{CHPE,MG},j}$ from the microgrid)
In time slot k, the balanced bill of a household i has the following formulation:
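$$BB_i(k) = C'_{\mathrm{G},i}(k) + C_{\mathrm{W},i}(k) + \beta_i C_{\mathrm{F}}(k) - \mu_i B_{\mathrm{CHP,share}}(k) - B_{\mathrm{CHP,self},i}(k) \tag{2.17}$$
where:
$$\begin{aligned}
C'_{\mathrm{G},i}(k) &= R_{\mathrm{G}}(k) \left[ E_{\mathrm{L},i}(k) - \alpha_{s,i} \big( E_{\mathrm{W}}(k) + E_{\mathrm{B}}(k) \big) \right] \\
C_{\mathrm{W},i}(k) &= R_{\mathrm{W}}\, \alpha_{s,i} E_{\mathrm{W}}(k) \\
C_{\mathrm{F}}(k) &= R_{\mathrm{F}} \sum_{j \in M_{\mathrm{CHP}}} F_{\mathrm{CHP},j}(k) \\
B_{\mathrm{CHP,share}}(k) &= R_{\mathrm{G}}(k) \sum_{j \in M_C} E^{\mathrm{out}}_{\mathrm{CHPE,MG},j}(k) \\
B_{\mathrm{CHP,self},i}(k) &= R_{\mathrm{G}}(k) \min\{ E_{\mathrm{CHPE},i},\ E_{\mathrm{L,CHPE},i} \} + R_{\mathrm{A}}(k) E^{\mathrm{out}}_{\mathrm{CHPE,UG},i}(k)
\end{aligned}$$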
$C'_{G,i}(k)$ is the energy consumption cost of household $i$ when only wind, battery, and utility grid
energy are considered as supply. $C'_{G,i}(k)$ can be negative, which means that the subscribed wind and
battery energy exceeds the household’s demand and the extra energy is sold to the utility grid.
$C_{W,i}(k)$ is the charge for wind turbine maintenance. $C_F(k)$ is the total fuel consumption cost for
µCHP generation in the microgrid. $B_{CHP,share}(k)$ is the total cost saving achieved by
exporting/importing µCHP electric energy inside the microgrid. $B_{CHP,self,i}(k)$ is the cost saving
achieved by household $i$ from its own µCHP generation. $M_{CHP}$ is the set of households with µCHPs.
$E_{L,i}(k)$ is the electric energy demand of household $i$ in time slot $k$. $E_W(k)$ and $E_B(k)$ are the
total wind energy generation and battery energy discharging, respectively. $\alpha_{s,i}$ is the wind and
battery energy subscription rate of household $i$. Unlike renewable energy, µCHP energy is sold at an
avoided cost rate $R_A(k)$, which is lower than the retail rate $R_G(k)$. To fairly balance $C_F(k)$ and
$B_{CHP,share}(k)$ for each household, the ratios $\beta_i$ and $\mu_i$ should be well designed. It is fair
for a household with larger $E_{L,CHPE,i}$, $E_{L,CHPH,i}$, and $E^{out}_{CHPE,UG,i}$ to pay more for fuel
consumption. $B_{CHP,share}(k)$ is achieved by µCHP energy sharing inside the microgrid and should be
balanced according to $E^{out}_{CHPE,MG,i}$ and $E^{in}_{CHPE,MG,i}$ of each household.
Thus, two metrics $U_{CF,i}$ and $U_{share,i}$ are designed to describe the fairness of balancing $C_F(k)$ and
BCHP,share(k), respectively, as
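$$U_{CF,i} = \beta_i C_F(k) / E_{CHPE,use,i} \tag{2.18}$$
$$U_{share,i} = \begin{cases}
\mu_i B_{CHP,share}(k) / E^{out}_{CHPE,MG,i}, & i \in M_C \\
\mu_i B_{CHP,share}(k) / E^{in}_{CHPE,MG,i}, & i \in M_B
\end{cases} \tag{2.19}$$
where:
$$E_{CHPE,use,i} = \left(E_{L,CHPE,i} + E^{out}_{CHPE,UG,i}\right)/\eta_e + E_{L,CHPH,i}/\eta_h$$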
$U_{CF,i}$ is the unit fuel cost per µCHP energy usage, i.e., supplying load and selling to the utility
grid, in which the energies are weighted by the µCHP electric efficiency $\eta_e$ and thermal efficiency
$\eta_h$. $E_{L,CHPH,i}$ is the part of $E_{CHPH,i}$ used for water tank heating. $U_{share,i}$ is the unit
cost saving per µCHP electric energy exported/imported inside the microgrid. $\beta_i$ and $\mu_i$ are
designed following two rules. First, for fairness, $U_{CF,i}$, as well as $U_{share,i}$, should be equal
for all households. Second, $\sum_{i \in M} \beta_i = \sum_{i \in M} \mu_i = 1$, where $M$ is the set of all
households in the ecosystem. Thus, $\beta_i$ and $\mu_i$ can be selected as:
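$$\beta_i = E_{CHPE,use,i} \Big/ \sum_{j \in M} E_{CHPE,use,j} \tag{2.20}$$
$$\mu_i = \begin{cases}
E^{out}_{CHPE,MG,i} \Big/ \Big(2 \sum_{j \in M_C} E^{out}_{CHPE,MG,j}\Big), & i \in M_C \\
E^{in}_{CHPE,MG,i} \Big/ \Big(2 \sum_{j \in M_C} E^{out}_{CHPE,MG,j}\Big), & i \in M_B
\end{cases} \tag{2.21}$$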
After βi and µi are determined, the balanced bill for each household can be calculated according to
(2.17).
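The ratio selection in (2.20) and (2.21) and the bill in (2.17) can be illustrated with a minimal Python sketch. The function names and the sample energies are hypothetical; the sketch assumes that in a given time slot each household is either a contributor or a beneficiary and that the total µCHP energy exported to the microgrid equals the total imported.

```python
def balance_ratios(e_use, e_out_mg, e_in_mg):
    """Return (beta, mu) following Eqs. (2.20) and (2.21).

    e_use[i]    -- weighted µCHP energy usage E_CHPE,use,i
    e_out_mg[i] -- µCHP energy household i exports to the microgrid
    e_in_mg[i]  -- µCHP energy household i imports from the microgrid
    Assumption: a household with a positive export is a contributor,
    otherwise it is treated as a beneficiary.
    """
    total_use = sum(e_use)
    total_out = sum(e_out_mg)
    beta = [e / total_use for e in e_use]
    mu = [(out_i if out_i > 0 else in_i) / (2 * total_out)
          for out_i, in_i in zip(e_out_mg, e_in_mg)]
    return beta, mu


def balanced_bill(c_g, c_w, c_f, b_share, b_self, beta_i, mu_i):
    """Balanced bill of one household in one time slot, Eq. (2.17)."""
    return c_g + c_w + beta_i * c_f - mu_i * b_share - b_self


# Hypothetical slot: households 0-1 export, households 2-3 import (kWh).
beta, mu = balance_ratios(e_use=[4.0, 3.0, 2.0, 1.0],
                          e_out_mg=[2.0, 1.0, 0.0, 0.0],
                          e_in_mg=[0.0, 0.0, 1.5, 1.5])
print(beta, mu)
```

With these ratios, the unit saving per shared kWh is the same for every household, and both ratio vectors sum to one when exports and imports match.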
2.8 Simulation and Result
2.8.1 Simulation Configuration
The simulation platform is implemented in Java. The community is configured with 10
houses and 4 µCHPs. Residents are assumed to leave home at 8 AM, and each simulated day starts at
8 AM and ends at 8 AM of the next day. Each house is configured with its own fixed load, schedulable
tasks (EV charging, laundry machine, dryer, PC downloading, etc.), and preferred execution time
periods. A one-week simulation is evaluated first.
The wind velocity is generated according to a Rayleigh distribution with an average speed of
20 m/s. It is assumed that there is 0-30% variance between each hourly updated wind forecast. There
is also 0-20% variance between the forecasted and actual wind power generation. The wind turbine
has a 20 kW rated power output, a 3.1 m/s cut-in speed, a 13.8 m/s rated speed, and a 54 m/s maximum
speed. The wind power generation is shown in Fig. 2.6.
Figure 2.6: Wind turbine power generation (wind power in kW versus day)
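A minimal Python sketch of how such a wind power series could be generated from the stated turbine parameters is given below. The Rayleigh sample is drawn as a Weibull variate with shape 2, whose scale is chosen to match the 20 m/s mean. The cubic ramp between the cut-in and rated speeds is an assumption made for illustration; the text does not specify the power curve, and the function names are hypothetical.

```python
import math
import random

RATED_KW, V_CUT_IN, V_RATED, V_MAX = 20.0, 3.1, 13.8, 54.0


def sample_wind_speed(mean_speed=20.0, rng=random):
    """Draw a Rayleigh-distributed speed (Weibull with shape 2)."""
    scale = 2.0 * mean_speed / math.sqrt(math.pi)  # gives the requested mean
    return rng.weibullvariate(scale, 2.0)


def turbine_power_kw(v):
    """Turbine output for wind speed v (m/s); cubic ramp is an assumption."""
    if v < V_CUT_IN or v > V_MAX:
        return 0.0
    if v >= V_RATED:
        return RATED_KW
    return RATED_KW * (v**3 - V_CUT_IN**3) / (V_RATED**3 - V_CUT_IN**3)


rng = random.Random(0)
hourly_power = [turbine_power_kw(sample_wind_speed(rng=rng)) for _ in range(24)]
print(hourly_power)
```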
µCHPs are modeled with electric efficiency 0.27, thermal efficiency 0.63, gFmin =
0.0013 Nft³/s, and gFmax = 0.009 Nft³/s. The hot water tank has a temperature set point of 65 °C with
an allowable range of ±3 °C. The VRB is first set with capacity Ecap = 10 kWh for overall system
evaluation. Its discharging power is constrained by PD,min = 0.5 kW and PD,max = 4 kW. Its
charging/discharging round-trip efficiency is set to 0.8. The time-varying utility price for the
simulation is generated based on the critical peak pricing (CPP) model [39] and shown in Fig. 2.7. In
the hierarchical optimization, the time resolution is set to τ = 6 minutes. For µCHP management,
the parameters are TC = TS = 30 minutes and NP = 6. In PSO, the number of particles in a swarm is 100
for DR and 500 for µCHP generation optimization. The maximum number of iterations is 1000. The
inertia weight w is initialized to 0.9, and c1 = c2 = 1. The Q-learning parameters are
selected as α = 0.1, γ = 0.9, and u ∈ [0, 100%].
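For reference, the role of the learning rate α = 0.1 and discount factor γ = 0.9 can be seen in the standard tabular Q-learning update, sketched below in Python. The state labels, the reward, and the discrete action set are hypothetical placeholders and are not the state and reward design used for VRB management in this work.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor from the text


def q_update(q, state, action, reward, next_state, actions):
    """One tabular Q-learning step; returns the updated Q(state, action)."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical example: actions are discharge fractions of the VRB.
actions = [0.0, 0.5, 1.0]
q_table = defaultdict(float)
print(q_update(q_table, "s0", 0.5, reward=1.0, next_state="s1", actions=actions))
```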
2.8.2 Result Analysis
The proposed energy ecosystem is first compared with a conventional distribution system
that is configured with DR and µCHPs but without the wind turbine and VRB. In the conventional
system, µCHPs are not interconnected and each of them generates power according to the heat demand
of its own house. The capital cost of a 20 kW wind turbine is about 70000 USD (20-year life span),
with an approximate maintenance cost of 1.5% of the investment cost per year [40, 41]. The VRB
discharging cost can be approximated as 0.1 USD/kWh [42]. Even including the cost of the wind
turbine and VRB, the results in Fig. 2.8 show that a large cost reduction can be achieved in the
ecosystem. The performance of the hierarchical optimization is further evaluated. For ease of
analysis, the wind turbine investment cost is not included in the following analysis.
Figure 2.7: Utility electricity price (utility price in USD/kWh versus day)
Figure 2.8: Cost comparison between our energy ecosystem and the conventional system (cost in USD versus day; series: Conventional System, Ecosystem (Energy Consumption Cost), Ecosystem (Investment and Maintenance))
2.8.3 Distributed DR Results and Analysis
The update of the preference function Fpr is affected by the weight parameter ρ. When ρ is
small, Fpr changes slowly. If ρ is set too large, Fpr becomes oversensitive even to a single
adjustment and therefore forms bumps, which makes it inaccurate and causes DR to search for
solutions only in some local regions. ρ = 0.1 is selected for the following simulations because it
gives a good trade-off between learning speed and accuracy. The energy consumption costs with and
without DR are compared. A randomly selected house is evaluated for the performance of DR. Results
are shown in Fig. 2.9.
The normalized satisfaction U is used to present the influence of DR on users’ satisfaction. Without
DR, tasks start at users’ most preferred time with U = 1. U with DR for the house is shown in Fig.
2.10. Results show that with DR, the daily energy consumption cost of the house is reduced by
up to 43% while a high satisfaction is still achieved.
Figure 2.9: Electricity consumption cost of one sample house in each day (cost in USD versus day; series: Without DR, With DR)
Figure 2.10: Satisfaction degree of one house in each day (normalized user’s satisfaction versus day)
2.8.3.1 Centralized Shared Cost-led µCHP Management Results and Analysis
The two-level shared cost-led µCHP management is compared with the heat-led management
strategy. DR is applied in both strategies. To evaluate the performance, one day is selected
randomly, and the electric and thermal load demand of the community on that day are shown in
Fig. 2.11. The total µCHP power generation and electric heat pump thermal power generation are shown
in Fig. 2.12. The hot water tank temperature of one house is regulated within the preset region shown
in Fig. 2.13. At the end of every NP coarse-grained periods (every 3 hours), the desired temperature
setpoint is achieved. The energy consumption cost of the whole community in each day, compared
with the heat-led management strategy, is shown in Fig. 2.14. Results show that the shared cost-led
µCHP management can reduce the energy consumption cost of the whole community by up to 19%.
The exact cost reduction depends on wind power generation, utility power price, and load variance of
the community.
Figure 2.11: Electric and thermal load demand of the community in the evaluated day (load demand in kW versus time of day; series: Electric Load Demand, Thermal Load Demand)
Figure 2.12: Total µCHPs and heat pumps generation in the community in the evaluated day (power generation in kW versus time of day; series: CHP Power Generation, Heat Pump Generation)
2.8.3.2 Centralized VRB Management Results and Analysis
The performance of VRB management based on Q-learning is compared with the strategy
in which the VRB discharges whenever the wind and µCHP power are insufficient to supply the total
load of the community. The total cost reduction of the community from VRB discharging in an
extended two-week simulation is shown in Fig. 2.15. In direct discharging mode, the VRB discharges
to supply the extra load whenever the utility price is higher than the VRB discharging cost. Cost
reduction increases when a higher VRB capacity is applied. This trend becomes less significant when
the capacity is large, e.g., λ = 0.1 with capacity larger than 50 kWh, since the VRB cannot always be
fully charged due to the limit of wind power generation. Compared to direct discharging, the
proposed Q-learning method can achieve higher cost reduction. As λ increases, more weight is given
to energy reservation than to load shaving.
Figure 2.13: Hot water tank temperature in the evaluated house (temperature in °C versus time of day)
Figure 2.14: Energy consumption cost of the community with µCHP system generation (cost in USD versus day; series: Heat-led, Shared Cost-led)
Figure 2.15: Energy consumption cost of the community with VRB discharging (cost reduction in % versus VRB capacity in kWh; series: λ = 0.1, λ = 0.4, λ = 0.8, Direct Discharging)
Chapter 3
Vehicle-to-Grid Reactive Power Compensation
3.1 Background and Motivation
V2G systems provide ancillary services to the grid, such as voltage/frequency regulation,
load shifting, and renewable energy supporting and balancing. Benefits and challenges of V2G
technologies are reviewed in [43]. Some challenges limit the implementation of V2G systems,
e.g., battery degradation, impacts on distribution equipment, and high investment cost. An optimal
scheduling of V2G energy and ancillary services is studied in [44]. The goal is to maximize profits
to the aggregator while providing peak load shaving and system flexibility to the utility and low EV
charging cost to customers. The problem is formulated as a linear programming problem and solved
in a centralized way. However, the scalability issue is not discussed. Work [45] studies the capacity
management of a V2G system for voltage regulation using a queuing network model that considers
EVs’ dynamic connections determined by drivers’ habits. The estimated capacity is used to set up
contracts between an aggregator and a grid operator for optimal grid support and maximum profits.
V2G load frequency control is studied in [46] with PEV users’ convenience (battery SOC for driving)
taken into consideration. Results show that the control performance degrades when users’
convenience is considered, but the difference becomes smaller as the number of PEVs increases.
The potential and characteristics of PEV bidirectional chargers used for reactive power
compensation are studied in [8, 47]. Work [48] proposes a V2G reactive power compensation system
for wind DG units connected with a PEV charging/parking lot. The problem is formulated as a two-stage
Stackelberg game, and an optimal pricing scheme for the compensation performance is derived. In
work [49], a combined frequency and voltage regulation system based on PEV real and reactive
power compensation is proposed with two joint optimization models, i.e., a command-based
model and a price-based model. Results show the trade-off between real and reactive power
compensation and the advantage of regulation. However, these works utilize PEVs as fixed VAR
resources belonging to specific charging/parking lots. Their roles as mobile and flexible VAR
resources for the smart grid are not studied. The compensation performance can be greatly improved
if PEVs are scheduled for charging/parking with the load profiles taken into account. In this case,
additional complexities are introduced into the system design and analysis, such as a multi-objective
cost function and a scalable problem-solving approach, which are investigated in this work.
3.2 V2G System Description
The power distribution system is assumed to be connected with distributed on-street
charging stations. Each PEV is equipped with an on-board bidirectional AC charger. The V2G system
as a whole is a cyber-physical system. The cyber part includes the information platform for
reservation, the decentralized PEV parking and charging scheduling algorithm, the real-time bus
monitoring platform, and the PWM control system for the full-bridge inverter charger. The physical
part involves PEVs and their on-board chargers as actuators. Our work focuses on the PEVs’
reservation and their parking/charging scheduling algorithm in the cyber part.
3.2.1 System Overview
The infrastructure of the proposed V2G system can be described hierarchically as two
interacting layers, an electrical layer and a geographical map layer, shown in Fig. 3.1. The
geographical map layer shows the streets, the locations of charging stations, and PEV owners’
destinations, reflecting PEV owners’ parking convenience. The electrical layer consists of electrical
facilities, including feeders, charging stations, etc. PEVs at charging stations are controlled for
both battery charging and reactive power compensation to grid buses. It is assumed that there are
multiple charging stations connected to the same bus along one road segment. These stations are
grouped and modeled as one charging station with its own capacity of parking/charging spaces. A PEV
owner drives his or her car to a charging station from a starting point, parks/charges the car, and
then walks to the destination. An example is shown in Fig. 3.1 at the top layer. A PEV owner has two
acceptable charging stations
within the walking distance from the destination. Thus, two possible driving and walking routes are
labeled as a and b, in which route a is preferred by the owner since it has a shorter walking distance
than b.
PEVs are scheduled for parking and charging according to the system infrastructure and
PEV owners’ day-ahead reservations. Scheduling is necessary to reduce competition for the limited
number of charging stations. Some people may prefer free driving styles without making reservations.
Their random access to charging stations cannot be guaranteed and can only be handled on a
best-effort basis. This case is not considered in our work. To make a reservation, a PEV owner
submits the charging request to the scheduling system before a deadline, indicating acceptable
charging stations, the energy charging requirement, the preferred parking interval, and the maximum
acceptable walking distance. The scheduling system aims to satisfy users’ charging requests, offer
them convenient parking service, and reduce the cost of charging as much as possible. Parking
convenience is represented by the parking interval and walking distance. Each user has a preferred
parking interval, which can be adjusted within a range at the expense of some convenience. Users are
also sensitive to the walking distance between stations and their destinations and prefer the
nearest ones. Monetary cost consists of charging and parking costs. Charging and parking prices are
time-varying and differ among stations. In addition to users’ benefits, the scheduling also makes
decisions for the optimal reactive power compensation to the grid.
Figure 3.1: Electrical and geographical map layers of the V2G reactive power compensation system (geographical layer: charging stations 1 to 5, starting point, destination, and routes a and b; electrical layer: charging stations, transformer, feeder, and load)
Figure 3.2: Charger operation mode (axes: Pch and Qcp with apparent power limit Smax; regions: charging or discharging, inductive or capacitive)
3.2.2 Model of On-board Charger
The PEV charger model described in [8, 47] is used in our system. The charger can operate
in one of eight modes shown in Fig. 3.2 according to the values of $P_{ch}$ and $Q_{cp}$, which are the
real and reactive power exchanged between the charger and the grid, respectively. We do not consider
the modes with battery discharging, i.e., we assume $P_{ch} \ge 0$. However, the problem formulations
and designed algorithms can be easily applied to situations with battery discharging by adjusting
the cost functions and constraints. The sign of $Q_{cp}$ represents different modes for reactive power.
When $Q_{cp} > 0$, the charger operates in an inductive mode and consumes reactive power from the grid,
while when $Q_{cp} < 0$, it is in a capacitive mode and supplies reactive power to the grid. $S_{max}$ is
the maximum apparent power that can be sustained by the charger, determined by the grid voltage $V_s$
and the charger’s maximum allowable current $I_{max}$ as $S_{max} = V_s I_{max}$. $S_{max}$ constrains
$P_{ch}$ and $Q_{cp}$ through $P_{ch}^2 + Q_{cp}^2 \le S_{max}^2$ in operation. Therefore, for a charging
station, its capability of reactive power compensation $Q_{cp,max}$ is determined not only by the
number of connected chargers, but also by the real charging power.
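The dependence of the reactive power capability on the real charging power follows directly from the apparent power limit, as the minimal Python sketch below shows. The function names and the numeric charger ratings are hypothetical.

```python
import math


def max_reactive_power(s_max, p_ch):
    """Largest |Q_cp| allowed by P_ch^2 + Q_cp^2 <= S_max^2."""
    if not 0.0 <= p_ch <= s_max:
        raise ValueError("charging power must lie in [0, s_max]")
    return math.sqrt(s_max**2 - p_ch**2)


def station_capability(chargers):
    """Sum of per-charger capabilities; chargers is a list of (s_max, p_ch)."""
    return sum(max_reactive_power(s, p) for s, p in chargers)


# Two hypothetical 5 kVA chargers drawing 3 kW and 4 kW of real power.
print(station_capability([(5.0, 3.0), (5.0, 4.0)]))  # 4 kVAr + 3 kVAr
```

A charger that is idle in real power can devote its full rating to compensation, while one charging at its limit contributes nothing.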
3.3 Multi-objective Optimization Formulation
Two objectives are considered in the aggregator for PEV parking and charging scheduling.
One is for PEV agent benefits, including low monetary cost and high parking convenience. The
other optimizes the reactive power compensation to the grid. A multi-objective optimization is then
formulated in the following subsections.
3.3.1 Optimizing PEV Agent Benefits
On the PEV side, the parking cost, charging cost, and parking convenience of all PEV
agents are considered in the optimization. The objective of PEV side optimization is to minimize the
sum of unit costs per convenience for all PEV agents in the target area. For one PEV agent i, the
parking cost Cpk,i and charging cost Cch,i are presented as functions of parking/charging prices and
scheduling variables:
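$$C_{pk,i} = \tau \sum_{j \in M_i} \sum_{t=1}^{T} y_{i,j}(t) R_{pk,j}(t) \tag{3.1}$$
$$C_{ch,i} = \tau \sum_{j \in M_i} \sum_{t=1}^{T} P_{ch,i,j}(t) R_{ch,j}(t) \tag{3.2}$$
where:
$$y_{i,j}(t) = u_i(t) x_{i,j} \tag{3.3}$$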
$M_i$ is the set of parking stations within the maximum acceptable walking distance of PEV agent
$i$. $T$ is the number of time slots considered in the optimization. $\tau$ is the duration of each time
slot. $R_{pk,j}(t)$ and $R_{ch,j}(t)$ are the parking and charging prices of station $j$ at time $t$,
respectively. Control variables include:
$x_{i,j}$: binary parking assignment variables. $x_{i,j} = 1$ indicates PEV $i$ is assigned to station $j$
for parking.
$u_i(t)$: binary parking status of PEV $i$. $u_i(t) = 1$ if and only if PEV $i$ parks at time $t$.
$y_{i,j}(t)$: binary variables of PEV parking status at station $j$ and time $t$. $y_{i,j}(t) = 1$ if and
only if PEV $i$ parks at station $j$ at time $t$.
$t_{s,i}, t_{e,i}$: parking interval $[t_{s,i}, t_{e,i}]$ of PEV $i$. They are integer variables and can be
scheduled within a range.
$P_{ch,i,j}(t)$: charging power of PEV $i$ at station $j$ at time $t$. It is a continuous variable in the
range $[0, P_{max,i}]$, where $P_{max,i}$ is the maximum charging power of PEV $i$.
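The control variables are interdependent through the following constraints:
$$P_{ch,i,j}(t) \le y_{i,j}(t) P_{max,i} \quad \forall i, j, t \tag{3.4}$$
$$(t_{s,i} - t)\, u_i(t) \le 0 \quad \forall i, t \tag{3.5}$$
$$(t - t_{e,i})\, u_i(t) \le 0 \quad \forall i, t \tag{3.6}$$
$$\sum_{t=1}^{T} u_i(t) = t_{e,i} - t_{s,i} + 1 \quad \forall i \tag{3.7}$$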
Constraint (3.4) ensures Pch,i,j(t) = 0 if PEV i is not scheduled for charging at time t at station j.
Constraints (3.5)-(3.7) guarantee that ui(t) = 1 if t ∈ [ts,i, te,i] and ui(t) = 0 otherwise.
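As an illustration of how (3.5)-(3.7) pin down the parking status and how (3.1)-(3.2) price it, the following Python sketch checks a candidate status vector and evaluates the two costs for a PEV assigned to a single station. The function names and the sample prices are hypothetical, and time slots are indexed from 1 as in the text.

```python
def parking_status(t_s, t_e, horizon):
    """u(t) for t = 1..horizon: 1 inside [t_s, t_e], 0 otherwise."""
    return [1 if t_s <= t <= t_e else 0 for t in range(1, horizon + 1)]


def satisfies_interval_constraints(u, t_s, t_e):
    """Check constraints (3.5), (3.6), and (3.7) for one PEV."""
    slots = range(1, len(u) + 1)
    ok_35 = all((t_s - t) * u[t - 1] <= 0 for t in slots)
    ok_36 = all((t - t_e) * u[t - 1] <= 0 for t in slots)
    ok_37 = sum(u) == t_e - t_s + 1
    return ok_35 and ok_36 and ok_37


def parking_and_charging_cost(u, p_ch, r_pk, r_ch, tau):
    """Costs (3.1) and (3.2) when the PEV is assigned to one station."""
    c_pk = tau * sum(ut * r for ut, r in zip(u, r_pk))
    c_ch = tau * sum(p * r for p, r in zip(p_ch, r_ch))
    return c_pk, c_ch


u = parking_status(2, 4, 6)
print(u, satisfies_interval_constraints(u, 2, 4))
print(parking_and_charging_cost(u, [0, 3, 3, 0, 0, 0], [1.0] * 6, [0.2] * 6, 0.5))
```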
The parking convenience Si of PEV agent i comprises the walking distance convenience
Swk,i and the parking time convenience Spt,i. A larger Si indicates a service of higher quality is
provided to PEV agent i. Swk,i decreases with the increase of walking distance. Spt,i can be reduced
by adjusting the PEV agent parking interval from the preferred one. Since the walking distance and
parking time interval have different scales and units, Swk,i and Spt,i are normalized and included in
Si as:
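$$\begin{aligned}
S_{wk,i} &= \frac{d_{max,i} - \sum_{j \in M_i} x_{i,j} d_{i,j} + \varepsilon}{d_{max,i} - d_{min,i} + \varepsilon} \\
S_{pt,i} &= \frac{\sum_{t=t^*_{s,i}}^{t^*_{e,i}} u_i(t) - \sum_{t=1}^{t^*_{s,i}-1} u_i(t) - \sum_{t=t^*_{e,i}+1}^{T} u_i(t)}{t^*_{e,i} - t^*_{s,i} + 1} \\
S_i &= \alpha S_{wk,i} + (1 - \alpha) S_{pt,i}
\end{aligned} \tag{3.8}$$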
where $\alpha \in (0, 1)$ is a weight coefficient. $d_{i,j}$ is the walking distance from parking station
$j$ to the destination of PEV agent $i$. $d_{max,i}$ and $d_{min,i}$ are the maximum and minimum
$d_{i,j}$, respectively. $\varepsilon$ is a small positive value ensuring a nonzero denominator when
$d_{max,i} = d_{min,i}$. $[t^*_{s,i}, t^*_{e,i}]$ is the preferred parking interval of PEV agent $i$.
$S_{wk,i}$ is always positive. $S_{pt,i}$ is constrained to be nonnegative in the optimization.
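The convenience terms in (3.8) can be evaluated as in the minimal Python sketch below. The function names, the default weight of 0.5, and the value of ε are hypothetical choices for illustration only; time slots are indexed from 1.

```python
def walking_convenience(d_assigned, d_min, d_max, eps=1e-6):
    """S_wk: 1 at the nearest acceptable station, near 0 at the farthest."""
    return (d_max - d_assigned + eps) / (d_max - d_min + eps)


def parking_time_convenience(u, ts_pref, te_pref):
    """S_pt: parked slots inside the preferred interval minus those outside."""
    inside = sum(u[ts_pref - 1:te_pref])
    before = sum(u[:ts_pref - 1])
    after = sum(u[te_pref:])
    return (inside - before - after) / (te_pref - ts_pref + 1)


def convenience(s_wk, s_pt, alpha=0.5):
    """S_i as the weighted sum of both terms (alpha is an assumed value)."""
    return alpha * s_wk + (1 - alpha) * s_pt


# Preferred interval is slots 2-4; the schedule is shifted to slots 3-5.
s_pt = parking_time_convenience([0, 0, 1, 1, 1, 0], 2, 4)
s_wk = walking_convenience(d_assigned=200.0, d_min=200.0, d_max=500.0)
print(s_pt, s_wk, convenience(s_wk, s_pt))
```

Shifting the interval by one slot removes one preferred slot and adds one non-preferred slot, so the parking time convenience drops from 1 to 1/3 in this example.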
The PEV monetary cost of an agent per convenience defines its cost function. The system
tries to schedule as many PEVs as possible. However, due to the limited station capacity, some
reservations have to be dropped to get feasible solutions. For each unscheduled PEV i, a constant
drop penalty PTi is included in the cost function, which is defined as:
PTi = Call,i/Si + ε (3.9)
where Call,i and Si are the maximum possible monetary cost and minimum satisfaction rate for the
PEV owner i, respectively, if it has been scheduled. ε is a positive constant.
The cost function for PEV agents is the sum of their cost per convenience plus the drop
penalty, which is to be minimized:
$$\min\; C_{PEV} = \sum_{i \in N} \left[ \frac{C_{ch,i} + C_{pk,i}}{S_i} + \Big(1 - \sum_{j \in M_i} x_{i,j}\Big) PT_i \right] \tag{3.10}$$
where N is the set of all registered PEVs. Constraints for the scheduling include:
• Assignment and capacity: One PEV can be assigned to at most one charging station. At any
time, a station cannot be scheduled with PEVs more than its capacity.
• Service: Users’ battery charging requests should be satisfied if their cars are scheduled. Each
PEV agent also has a minimum acceptable convenience rate and maximum acceptable cost per
convenience.
• Charging and compensation: Chargers’ real and reactive power are constrained by their
maximum apparent power Smax. When a PEV is not scheduled for charging, its charging
power should be 0.
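A minimal Python sketch of the per-agent term in (3.10), the drop penalty in (3.9), and the station capacity check is given below. The function names and numbers are hypothetical, and an unscheduled PEV is assumed to incur zero parking and charging cost so that only its drop penalty remains.

```python
def drop_penalty(c_all_max, s_min, eps=1e-3):
    """PT_i from Eq. (3.9): worst-case cost per convenience plus eps."""
    return c_all_max / s_min + eps


def agent_cost(c_ch, c_pk, s_i, assigned, penalty):
    """Per-agent term of Eq. (3.10); assigned is sum_j x_ij (0 or 1)."""
    return (c_ch + c_pk) / s_i + (1 - assigned) * penalty


def capacity_ok(y_station, capacity):
    """True if the station never hosts more PEVs than its capacity.

    y_station[i][t] is the parking status of PEV i at this station.
    """
    horizon = len(y_station[0])
    return all(sum(y_i[t] for y_i in y_station) <= capacity
               for t in range(horizon))


penalty = drop_penalty(c_all_max=6.0, s_min=0.5)
total = agent_cost(2.0, 1.0, 0.75, 1, penalty) + agent_cost(0.0, 0.0, 0.75, 0, penalty)
print(total, capacity_ok([[1, 1, 0], [1, 0, 0]], capacity=1))
```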
3.3.2 Optimizing Utility Grid Reactive Power Compensation
The objective for the utility grid is to minimize the total insufficiency of the VAr reservoir.
For each charging station, the reactive power compensation requirement of its connected load at a
given time is estimated from historical data. It can be either reactive power consumption (inductive
and positive) or injection (capacitive and negative). Once the magnitude of reactive power
compensation is determined, chargers can be controlled for reactive power injection or consumption.
The objective is to minimize the total gap between the requirement and the compensation of all
stations over time:
$$\min\; C_{utl} = \sum_{j \in M} \sum_{t=1}^{T} \left[ Q_{req,j}(t) - \sum_{i \in N_j} Q_{i,j}(t) \right] \tag{3.11}$$
where $M$ is the set of all stations. $N_j$ is the set of PEV agents for whom station $j$ is within
their maximum walking distances. $Q_{req,j}(t)$ is the magnitude of the reactive power compensation
requirement at station $j$ at time $t$. $Q_{i,j}(t) \ge 0$ is a continuous control variable indicating
the magnitude of reactive power compensation from the charger of PEV $i$ for $Q_{req,j}(t)$. The
optimization also includes
control variables appearing in the PEV side optimization. Besides the constraints in the PEV side
optimization, additional constraints include:
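$$\text{(Power Constraint)} \quad Q_{i,j}^2(t) + P_{ch,i,j}^2(t) \le S_{max,i}^2 \tag{3.12}$$
$$\text{(Nonnegative Gap)} \quad Q_{req,j}(t) - \sum_{i \in N_j} Q_{i,j}(t) \ge 0 \tag{3.13}$$
$$\text{(Power Availability)} \quad Q_{i,j}(t) \le y_{i,j}(t) S_{max,i} \tag{3.14}$$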
where Smax,i is the maximum apparent power of the charger in PEV i. Constraint (3.14) ensures that
Qi,j(t) = 0 when PEV i is not parked at station j at time t.
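For a single station and time slot with fixed charging powers, the gap in (3.11) under constraints (3.12)-(3.14) can be illustrated with the Python sketch below. The greedy fill is an assumption made for illustration, not the scheduling method of this work, and the function names are hypothetical.

```python
def allocate_compensation(q_req, q_caps):
    """Fill the requirement charger by charger without overshooting it.

    q_caps[i] is the capability of charger i, e.g. sqrt(S_max^2 - P_ch^2).
    Returns the allocations and the remaining nonnegative gap (3.13).
    """
    remaining = q_req
    alloc = []
    for cap in q_caps:
        q = min(cap, remaining)
        alloc.append(q)
        remaining -= q
    return alloc, remaining


def charger_feasible(q, p_ch, y, s_max, tol=1e-9):
    """Check Q >= 0 together with constraints (3.12) and (3.14)."""
    within_circle = q * q + p_ch * p_ch <= s_max * s_max + tol
    available = q <= y * s_max + tol
    return q >= 0 and within_circle and available


print(allocate_compensation(6.0, [4.0, 3.0]))   # requirement fully covered
print(allocate_compensation(10.0, [4.0, 3.0]))  # a gap of 3 remains
```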
3.3.3 Multi-Objective Optimization Formulation
The multi-objective optimization problem is formulated as (3.15) considering both benefits
for PEV agents and the utility grid, and will be solved for control variable values under the constraints.
For (3.15), Pareto points (multiple optimal solutions) are solved as feasible solutions that do not
dominate each other, i.e., keeping the trade-off between multiple objectives.
min {CPEV, Cutl} (3.15)
s.t. all constraints for (3.10) and (3.11)
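The notion of non-dominated solutions used for (3.15) can be made concrete with a short Python sketch. Both objectives are minimized; the function names and the sample cost pairs are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical (C_PEV, C_utl) pairs of four feasible schedules.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
print(pareto_front(candidates))  # (3.0, 4.0) is dominated by (2.0, 3.0)
```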
3.4 Multi-Objective Optimization Solution Approach
The solution approach for the multi-objective optimization (3.15) is shown in Fig. 3.3. It
includes linearization, problem reformulation using the NNC method, and problem solving using a
decentralized algorithm.
3.4.1 Problem linearization
The multi-objective optimization (3.15) is a mixed integer non-linear programming problem.
It has a non-linear cost function CPEV with fractional components. There also exist non-linear
constraints such as (3.3), (3.5), and (3.6) with bilinear terms, and the quadratic constraint (3.12).
To solve (3.15) efficiently, several techniques are used to reformulate it as a MILP problem.
Figure 3.3: Solution approach diagram (nonlinear terms on the PEV side and grid side are linearized by introducing extra variables for fractional terms, Glover’s linearization scheme for bilinear terms, and piecewise linear approximation with the multiple choice model for quadratic terms; the resulting MILP is converted by the normalized normal constraint (NNC) method into single-objective problems for the two anchor points and Pareto points 1 to m, which are solved by the decentralized algorithm based on Lagrangian decomposition)
The bilinear terms in the constraints are products of a binary variable and an integer or
continuous variable, which can be linearized with Glover’s linearization scheme [50]. The
constraint (3.12) is reformulated as $Q_{i,j}(t) \le g(P_{ch,i,j}(t))$, where $g(P_{ch,i,j}(t))$ is the
piecewise linear approximation of $\sqrt{S_{max,i}^2 - P_{ch,i,j}^2(t)}$. $g(P_{ch,i,j}(t))$ is
represented with the multiple choice model [51] in the optimization formulation. Since both
$C_{ch,i} + C_{pk,i}$ and $S_i$ are linear functions of continuous and integer variables, the fractional
term $(C_{ch,i} + C_{pk,i})/S_i$ can be linearized with the method proposed in [52]. New variables
$v_i = 1/S_i$ and $z_{i,j}(t) = P_{ch,i,j}(t)/S_i \;\forall i, j, t$ are introduced
and $C_{PEV}$ is reformulated as:
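$$C_{PEV} = \sum_{i \in N} C_{PEV,i} = \sum_{i \in N} \left\{ \tau \sum_{j \in M_i} \sum_{t=1}^{T} \left[ R_{ch,j}(t) z_{i,j}(t) + R_{pk,j}(t) w_{i,j}(t) \right] + \Big(1 - \sum_{j \in M_i} x_{i,j}\Big) PT_i \right\} \tag{3.16}$$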
where $w_{i,j}(t) = y_{i,j}(t) v_i$. $w_{i,j}(t)$ is introduced to represent the bilinear term
$y_{i,j}(t) v_i$, which is then linearized according to Glover’s linearization scheme. With the
fractional, bilinear, and quadratic terms reformulated, the cost function and related constraints in
(3.15) are linear and (3.15) becomes a multi-objective MILP problem.
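Two of the linearization steps can be checked numerically with the Python sketch below. The first function evaluates the four linear inequalities commonly used to linearize the product of a binary and a bounded continuous variable and shows that they leave the product as the only feasible value. The second evaluates a chord-based piecewise linear approximation of the reactive power limit; because the limit curve is concave, the chords lie on or below it. The bounds, the number of segments, and the function names are hypothetical, and the sketch does not reproduce the multiple choice MILP encoding of [51].

```python
import math


def glover_interval(y, v, v_lo, v_hi):
    """Feasible range of w under the linear constraints
    v_lo*y <= w <= v_hi*y and v - v_hi*(1-y) <= w <= v - v_lo*(1-y)."""
    lo = max(v_lo * y, v - v_hi * (1 - y))
    hi = min(v_hi * y, v - v_lo * (1 - y))
    return lo, hi


def piecewise_q_limit(p, s_max, n_segments=4):
    """Chord interpolation of sqrt(s_max^2 - p^2) for p in [0, s_max]."""
    width = s_max / n_segments
    k = min(int(p / width), n_segments - 1)
    p0, p1 = k * width, (k + 1) * width
    q0 = math.sqrt(s_max**2 - p0**2)
    q1 = math.sqrt(max(0.0, s_max**2 - p1**2))
    return q0 + (q1 - q0) * (p - p0) / width


print(glover_interval(1, 1.6, 1.0, 4.0))  # collapses to w = v
print(glover_interval(0, 1.6, 1.0, 4.0))  # collapses to w = 0
print(piecewise_q_limit(3.0, 5.0), math.sqrt(5.0**2 - 3.0**2))
```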
3.4.2 Normalized Normal Constraint Method
Since two objectives are included in the optimization, two anchor points and the Utopia
line are first determined in the NNC method. Anchor points are special Pareto points in which only
one of the two objectives is optimized. The objective costs are then normalized according to the
values at these anchor points. The line joining the two anchor points is called the Utopia line. To
solve MP Pareto points, the Utopia line is divided evenly with MP points. At each point on the Utopia
line, the related Pareto point is solved by minimizing one of the two objective costs with a normal
line constraint added. The NNC method thus transforms the multi-objective optimization into multiple
single-objective optimizations for solving Pareto points. Each transformed single-objective
optimization is solved by a decentralized
algorithm based on Lagrangian Decomposition for good scalability.
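The geometric part of the NNC method for two objectives can be sketched in Python as follows. After normalization the anchor points map to (0, 1) and (1, 0), the Utopia line joins them, and each single-objective problem adds one normal line constraint. Including both end points in the even division, the function names, and the sample anchor costs are assumptions for illustration.

```python
def normalize(f, anchor1, anchor2):
    """Map objective pair f so that anchor1 -> (0, 1) and anchor2 -> (1, 0).

    anchor1 is the solution minimizing objective 1, anchor2 the one
    minimizing objective 2; both are (f1, f2) pairs.
    """
    f1_best, f2_worst = anchor1
    f1_worst, f2_best = anchor2
    return ((f[0] - f1_best) / (f1_worst - f1_best),
            (f[1] - f2_best) / (f2_worst - f2_best))


def utopia_line_points(m):
    """m >= 2 evenly spaced points on the segment from (0, 1) to (1, 0)."""
    return [(k / (m - 1), 1 - k / (m - 1)) for k in range(m)]


def satisfies_normal_constraint(f_bar, x_p, tol=1e-12):
    """Normal line constraint N . (f_bar - x_p) <= 0 with N = (1, -1)."""
    return (f_bar[0] - x_p[0]) - (f_bar[1] - x_p[1]) <= tol


anchors = ((10.0, 50.0), (40.0, 20.0))  # hypothetical (C_PEV, C_utl) values
print(normalize((25.0, 30.0), *anchors))
print(utopia_line_points(3))
```

For each Utopia line point, the second normalized objective would be minimized over the schedules that satisfy the normal line constraint at that point.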
3.4.3 Decentralized Algorithm Based on Lagrangian Decomposition
3.4.3.1 Framework of decentralized algorithm
Decentralized optimization algorithms decompose a large-scale problem into smaller
subproblems and solve them simultaneously. Since each subproblem is much easier to solve,
decentralized algorithms are efficient for large-scale complex optimizations. The decentralized
algorithm is designed with the framework shown in Fig. 3.4 for solving anchor points and Pareto
points. Three types of primal problems, denoted as $P_{PEV}$, $P_{utl}$, and $P_{Pm}$, are for finding the
two anchor points and the $m$th Pareto point, respectively. Generally, the number of PEVs is larger
than the number of stations in the scheduling problem. $P_{PEV}$, $P_{utl}$, and $P_{Pm}$ are decomposed
in terms of PEVs into smaller subproblems that can be solved more efficiently. Any constraints
coupled among PEVs must first be relaxed, including the parking capacity constraints, the normal line
constraint introduced by the NNC method, and the nonnegative gap constraint (3.13). Taking $P_{PEV}$
as an example, its Lagrangian relaxation $LRP_{PEV}$ has the following form:
$$\min \sum_{i \in N} C_{PEV,i} + \sum_{j \in M} \sum_{t=1}^{T} \lambda_{j,t} \Big( \sum_{i \in N_j} y_{i,j}(t) - A_{cap,j} \Big) \tag{3.17}$$
where Acap,j is the capacity of station j. The minimization of (3.17) is subject to the same constraints
in the primal problem PPEV except for the relaxed station capacity constraint. λj,t is a non-negative
Lagrangian multiplier. LRPutl and LRPPm are formulated in similar ways.
The relaxed primal problems $LRP_{PEV}$, $LRP_{utl}$, and $LRP_{Pm}$ are then decomposed into
Lagrangian subproblems $LSP_{PEV,i}$, $LSP_{utl,j}$, and $LSP_{Pm,i}$, respectively. After removing
constant terms, the subproblem $LSP_{PEV,i}$ for PEV owner $i$ is:
$$\min\; C_{PEV,i} + \sum_{j \in M_i} \sum_{t=1}^{T} \lambda_{j,t}\, y_{i,j}(t) \tag{3.18}$$
subject to the constraints in LRPPEV relating to PEV i. Similarly, formulations of LSPutl,j and
LSPPm,i can be determined. Each subproblem is a MILP problem and solved independently. With
constraints relaxation, solutions from subproblems may not be feasible for the primal problems.
These solutions form lower bounds (LBs) of the primal problems and need to be restored as feasible
solutions. Feasibility restoration heuristics are designed to recover solutions of subproblems and
obtain upper bounds (UBs) of primal problems. Lagrangian dual problems LDPPEV, LDPutl, and
LDPPm are solved to maximize the LBs. In each iteration, the algorithm tries to reduce the duality
gaps between the LBs and the UBs until the termination condition is satisfied. The Lagrangian dual
problem LDPPEV has the formulation as:
$$\max_{\lambda \ge 0}\; C_{LRP_{PEV}}(\lambda) \tag{3.19}$$
where $C_{LRP_{PEV}}$ is the cost function of $LRP_{PEV}$ and $\lambda$ is a vector of Lagrangian
multipliers. Formulations of $LDP_{utl}$ and $LDP_{Pm}$ are determined in similar ways.
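The separability that makes the decomposition possible can be verified numerically: the relaxed objective (3.17) equals the sum of the per-PEV subproblem objectives (3.18) minus a term that is constant in the decision variables. The Python sketch below uses hypothetical data and function names, with y[i][j][t] set to 0 for stations outside the acceptable set of PEV i.

```python
def relaxed_objective(costs, y, lam, cap):
    """Objective of the relaxed problem, Eq. (3.17)."""
    total = sum(costs)
    for j, lam_j in enumerate(lam):
        for t, lam_jt in enumerate(lam_j):
            occupancy = sum(y_i[j][t] for y_i in y)
            total += lam_jt * (occupancy - cap[j])
    return total


def subproblem_objective(cost_i, y_i, lam):
    """Objective of one PEV subproblem, Eq. (3.18)."""
    return cost_i + sum(lam_jt * y_i[j][t]
                        for j, lam_j in enumerate(lam)
                        for t, lam_jt in enumerate(lam_j))


def constant_term(lam, cap):
    """Part of (3.17) that does not depend on any decision variable."""
    return sum(lam_jt * cap[j] for j, lam_j in enumerate(lam) for lam_jt in lam_j)


# Two PEVs, two stations, two time slots (hypothetical values).
costs = [3.0, 5.0]
y = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
lam = [[0.5, 0.0], [0.25, 0.25]]
cap = [1, 1]
total = relaxed_objective(costs, y, lam, cap)
split = sum(subproblem_objective(c, y_i, lam) for c, y_i in zip(costs, y))
print(total, split - constant_term(lam, cap))
```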
3.4.3.2 Subgradient search
Subgradient search is widely used to solve Lagrangian dual problems [53, 54]. It starts
with selected multipliers, which are updated iteratively according to the subgradients of the
relaxed constraints, the lower bound, and the best upper bound of the primal problem found so far.
In the $r$th iteration, $\alpha^r > 0$ is a scalar coefficient determining the step size of the
multiplier update. If the lower bound or the best upper bound does not improve for a number of
iterations, $\alpha^{r+1}$ is reduced by a discount factor in the next iteration. The algorithm
terminates under one of three conditions: the gap between the best upper bound and the current lower
bound is smaller than a preset threshold, $\alpha^r$ is less than a small value, or the maximum
iteration number is reached.
Taking $LDP_{PEV}$ as an example, the subgradient search algorithm is shown in Algorithm 3.
$\gamma^r_{j,t}$ and $s^r$ are the subgradient and step size, respectively, in the $r$th iteration.
$y^{*r}_{i,j}(t)$ is the optimal parking status of PEV $i$ obtained by solving $LSP_{PEV,i}$.
$C^r_{LRP_{PEV}}$ is the optimal cost of $LRP_{PEV}$ solved in the $r$th iteration; it is a lower bound
of $P_{PEV}$. $UB^*$ is the best upper bound found by feasibility restoration. $\alpha^r$ is a scalar
satisfying $\alpha^r > 0$. If $C^r_{LRP_{PEV}}$ does not improve for a number of iterations,
$\alpha^{r+1}$ is reduced by a discount factor in the next iteration. The algorithm terminates under
one of three conditions: the gap between $UB^*$ and $C^r_{LRP_{PEV}}$ is smaller than a preset
threshold, $\alpha^r$ is smaller than a small value, or the maximum iteration number is reached.
Figure 3.4: Framework of the decentralized optimization with Lagrangian relaxation (the primal problems Ppev, Putl, and PPm are relaxed into LRPpev, LRPutl, and LRPPm and decomposed into subproblems LSP; the subproblem solutions give the LB, feasibility recovery gives the UB, and the subgradient search on the Lagrangian dual problems LDPpev, LDPutl, and LDPPm updates the multipliers in each iteration)
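Algorithm 3 Subgradient search for $LDP_{PEV}$
1: Let $\lambda^0_{j,t} = 0 \;\; \forall j, t$;
2: while termination condition is not satisfied do
3:   Collect optimal solutions from all $LSP_{PEV}$;
4:   $\gamma^r_{j,t} = \sum_{i \in N_j} y^{*r}_{i,j}(t) - A_{cap,j} \;\; \forall j, t$;
5:   $s^r = \alpha^r \dfrac{UB^* - C^r_{LRP_{PEV}}(\lambda)}{\sum_{j \in M} \sum_{t=1}^{T} (\gamma^r_{j,t})^2}$;
6:   $\lambda^{r+1}_{j,t} = \max\{0, \; \lambda^r_{j,t} + s^r \gamma^r_{j,t}\} \;\; \forall j, t$;
7:   Send $\lambda^{r+1}_{j,t}$ to all $LSP_{PEV}$;
8:   $r \leftarrow r + 1$;
9: end while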
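Steps 4 to 6 of Algorithm 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Java/MATLAB implementation used in this work: the function name `subgradient_step` and the nested-list layout are hypothetical, and `occupancy[j][t]` stands for the sum of $y^{*r}_{i,j}(t)$ over the PEVs, already collected from the subproblems.

```python
def subgradient_step(lam, occupancy, capacity, alpha, ub_best, lb_current):
    """One multiplier update following steps 4-6 of Algorithm 3.

    lam[j][t]       : current multipliers (hypothetical nested-list layout)
    occupancy[j][t] : number of PEVs the subproblems placed at station j in slot t
    capacity[j]     : station capacity A_cap,j
    """
    # Step 4: subgradient of each relaxed capacity constraint
    gamma = [[occupancy[j][t] - capacity[j] for t in range(len(lam[j]))]
             for j in range(len(lam))]
    norm_sq = sum(g * g for row in gamma for g in row)
    if norm_sq == 0:
        # Every capacity constraint is tight, so the multipliers stay put
        return [row[:] for row in lam], gamma, 0.0
    # Step 5: step size scaled by the gap between the best UB and the current LB
    step = alpha * (ub_best - lb_current) / norm_sq
    # Step 6: move along the subgradient and project onto lambda >= 0
    new_lam = [[max(0.0, lam[j][t] + step * gamma[j][t])
                for t in range(len(lam[j]))]
               for j in range(len(lam))]
    return new_lam, gamma, step


# Toy data: two stations, two time slots, station 0 overbooked in slot 0
new_lam, gamma, step = subgradient_step(
    lam=[[0.0, 0.0], [0.0, 0.0]],
    occupancy=[[3, 1], [2, 2]],
    capacity=[2, 2],
    alpha=1.0, ub_best=10.0, lb_current=6.0)
```

In the toy call, only the overbooked slot receives a positive multiplier, while the underused slot is projected back to zero.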
3.4.3.3 Feasibility restoration
Optimal solutions of the Lagrangian subproblems are usually infeasible for the primal problems and must be repaired to restore feasibility. The main idea is to check the solutions of each subproblem one by one, by PEV or by station, in a predefined order. Solutions from a subproblem are kept if they are feasible. After feasibility checking, all subproblems with infeasible solutions are collected and their solutions are adjusted to be feasible with heuristics. Taking $P_{PEV}$ as an example, the feasibility restoration heuristic first orders the PEVs by increasing values of $C_{PEV,i}$. The feasibility of each subproblem's solutions is checked in this order. Subproblems with infeasible solutions are collected and rescheduled later by adjusting charging station assignments or parking intervals. The heuristic can also improve solution quality, i.e., reduce the objective cost of the primal problems. The heuristic feasibility restoration for the primal $P_{PEV}$ is shown in Algorithm 4.
Algorithm 4 Feasibility restoration for $P_{PEV}$
1: Let $N_F = \emptyset$ and $N_D = \emptyset$;
2: Order all PEVs with increasing value of monetary cost per convenience plus drop penalty;
3: for each PEV $i_k$ in order do
4:   if $\sum_{i \in N_F} y_{i,j}(t) + y_{i_k,j}(t) \le A_{cap,j} \;\; \forall j, t$ then
5:     Accept the solution from $LSP_{PEV,i_k}$, $N_F = N_F \cup \{i_k\}$;
6:   else
7:     $N_D = N_D \cup \{i_k\}$;
8:   end if
9: end for
10: Order the PEVs in $N_D$ with increasing value of scheduled parking interval $t_{pk,i}$;
11: for each PEV $i \in N_D$ in order do
12:   Gradually reduce $t_{pk,i}$ by increasing $t_{s,i}$ or decreasing $t_{e,i}$ until it can be scheduled to one of the charging stations or further adjustment is not acceptable for the PEV owner;
13:   if PEV $i$ can be scheduled then
14:     Accept the scheduling for PEV $i$;
15:   else
16:     Discard $i$ without scheduling;
17:   end if
18: end for
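The acceptance pass in steps 1 to 9 of Algorithm 4 can be sketched as below. This is an illustrative Python fragment only: the function name `greedy_accept`, the dictionary layout, and the toy data are assumptions, and the rescheduling pass in steps 10 to 18 is omitted because it depends on owner preferences that are not modeled here.

```python
from collections import Counter


def greedy_accept(order, slots, capacity):
    """Split PEVs into accepted (N_F) and deferred (N_D) lists.

    order    : PEV ids, already sorted by the cost criterion of step 2
    slots    : slots[pev] = set of (station, time_slot) pairs the PEV occupies
    capacity : capacity[station] = A_cap,j
    """
    used = Counter()              # occupancy contributed by accepted PEVs
    accepted, deferred = [], []
    for pev in order:
        # Only the slots this PEV occupies can become newly overbooked
        fits = all(used[(j, t)] + 1 <= capacity[j] for (j, t) in slots[pev])
        if fits:
            for key in slots[pev]:
                used[key] += 1
            accepted.append(pev)
        else:
            deferred.append(pev)
    return accepted, deferred


# Toy data: one station with a single space; "a" and "b" clash in slot 2
accepted, deferred = greedy_accept(
    order=["a", "b", "c"],
    slots={"a": {("s1", 1), ("s1", 2)}, "b": {("s1", 2)}, "c": {("s1", 3)}},
    capacity={"s1": 1})
```

Because accepted PEVs already respect the capacities, checking only the slots occupied by the candidate PEV is equivalent to the all-$j$, all-$t$ test in step 4.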
3.5 Results and Analysis
The designed algorithms are implemented in Java and MATLAB and run on a desktop with an Intel i7-2600 CPU and 16 GB of RAM. Each subproblem is solved by CPLEX in the TOMLAB optimization toolbox [55]. The simulation covers an 11-hour period with a one-hour time resolution. In the simulation, a notional distribution grid is constructed on a part of the Boston Back Bay region shown in Fig. 3.5. It is a 2 km × 0.8 km commercial area with five types of buildings: office buildings, apartment buildings, retail stores, restaurants, and storage buildings. Their load demand and power factors are generated according to load and power correction study reports [56, 57, 58, 59]. The total real power load in the area is generated between 10 MW and 20 MW over the daytime. Power factors are generated between 0.90 and 0.97 according to the load types. The area is configured with 4 garages and 48 on-street charging stations, each of which consists of multiple charging spaces and adopts the AC level-2 charging standard. A day-ahead time-of-use (TOU) charging rate is applied to shave the peak charging load through scheduling. Charging rates are set between 0.18 $/kWh and 0.36 $/kWh, with the peak period between 3 pm and 6 pm. Hourly parking prices are set to $2 and $4 for regular and busy streets, respectively. Destinations of PEVs are generated randomly in the area. PEVs are equipped with 16 kWh batteries and on-board chargers with $S_{max}$ = 9.6 kVA.
Figure 3.5: Simulation case setup for a distribution feeder and locations of charging stations (legend: substation, low-voltage load node, on-street charging station, parking garage)
To evaluate the performance of the proposed system, seven test cases are simulated with different configurations of PEV numbers ($N_{pev}$), total garage charging capacities ($C_{gar}$), total on-street charging capacities ($C_{str}$), battery charging requirement as a percentage of battery capacity ($\eta_{ch}$), and on-street parking intervals ($T_{str}$), which are shown in Table 3.1. For PEV agents with office buildings as destinations, parking intervals are relatively long and are randomly generated between 7 hours and 11 hours as regular parking intervals. PEVs with other destinations are modeled with shorter and more flexible on-street parking intervals $T_{str}$. The PEV density ratio $\beta = N_{pev}\bar{T}_{pk} / [T_{pd}(C_{gar} + C_{str})]$ is defined as a metric to describe the PEV penetration level, where $\bar{T}_{pk}$ is the average PEV parking interval and $T_{pd}$ = 11 h is the simulation time period. A larger β indicates that charging stations are less sufficient and that more conflicts may occur in PEV charging reservations. Values of β in each case are listed in Table 3.1. In cases 1, 2, and 3, the number of PEVs is scaled up while β is kept similar. In cases 6 and 7, charging station capacities are reduced and β becomes larger. Case 4 represents a scenario in which PEV agents have less charging demand. Case 5 is configured with a shorter $T_{str}$, which simulates a busy area with more frequent PEV turnaround.

Table 3.1: Parking interval and station capacity configurations for different cases
Case Index   Npev   Cgar   Cstr   ηch      Tstr    β
1            250    40     240    30-70%   2-9 h   0.483
2            350    60     336    30-70%   2-9 h   0.480
3            450    100    432    30-70%   2-9 h   0.471
4            450    100    432    10-30%   2-9 h   0.471
5            350    60     336    30-70%   2-5 h   0.342
6            350    40     240    30-70%   2-9 h   0.661
7            450    40     240    30-70%   2-9 h   0.840
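As a small worked illustration of the density ratio β, the Python sketch below evaluates the definition and inverts it to recover the average parking interval implied by a listed β. The function names are hypothetical, and the implied interval is a back-calculation from Table 3.1, not a value reported in the text.

```python
def pev_density_ratio(n_pev, avg_parking_h, period_h, c_gar, c_str):
    """beta = N_pev * avg parking interval / (T_pd * (C_gar + C_str))."""
    return n_pev * avg_parking_h / (period_h * (c_gar + c_str))


def implied_avg_parking(beta, n_pev, period_h, c_gar, c_str):
    """Invert the definition to get the average parking interval in hours."""
    return beta * period_h * (c_gar + c_str) / n_pev


# Case 1 of Table 3.1: N_pev = 250, C_gar = 40, C_str = 240, beta = 0.483
case1_parking_h = implied_avg_parking(0.483, 250, 11, 40, 240)   # about 5.95 h
round_trip_beta = pev_density_ratio(250, case1_parking_h, 11, 40, 240)
```

The round trip simply confirms that the two functions are inverses; it does not validate the simulation inputs themselves.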
In each study case, 10 Pareto points are solved. The maximum iteration number for solving Pareto points is limited to 150. The duality gap (UB − LB)/LB × 100% is defined to evaluate solution quality. The iterations for solving an anchor point in case 2 are shown in Fig. 3.6. Over the iterations, the best LB and UB converge to steady values. Cases 2, 3, and 6 are considered as examples. Their solved normalized Pareto frontiers are shown in Fig. 3.7. Here m is the index of the Pareto points, where m = 0 and m = 9 are the two anchor points optimizing grid compensation performance and PEV agent benefits, respectively. As m increases, the optimization places progressively more weight on PEV agent benefits. Duality gaps of these Pareto points are shown in Fig. 3.8 and are all below 5%, indicating satisfactory results for the complex MILP problem. The Pareto frontier given in Fig. 3.7 shows that the benefits of PEV agents and the utility grid are generally in conflict, especially for case 6 with a larger β. As m increases from 0 to 2, the objective cost on the PEV side decreases greatly with similar
grid compensation performance. Thus, optimizing only one objective substantially worsens the other objective. By selecting appropriate Pareto points, the benefits of PEV agents can be largely improved without sacrificing much grid compensation. Among the obtained Pareto points, one can be selected by an aggregator for the best grid compensation while satisfying PEV benefits.

Figure 3.6: Iterations of solving an anchor point in case 2 (objective cost on grid side versus iteration step; best lower bound and best upper bound; duality gap = 1.32%)
Figure 3.7: Pareto optimal points solved in cases 2, 3, and 6 (normalized station objective cost versus normalized PEV objective cost; anchor points m = 0 and m = 9)
The influence of parking patterns and station capacities on PEV agent benefits is further analyzed for cases 2, 5, and 6. Figs. 3.9 and 3.10 show the average unit cost per convenience $C_{u,pev}$ of scheduled PEVs and the total PEV drop penalties $PT_{total}$, respectively, for different amounts of average reactive power compensation. Among the three cases, case 2 can achieve the largest amount of reactive power compensation because it has both a longer average parking interval and a larger station capacity. With moderate reactive power compensation, i.e., between 1.0 MVAr and 1.4 MVAr, $C_{u,pev}$ in case 5 is smaller than that in case 2 for the same reactive power compensation. This is because the smaller $T_{str}$ in case 5 makes PEV scheduling less competitive and less prone to conflicts. However, $C_{u,pev}$ increases greatly when reactive power compensation is highly demanded. The limited station capacity in case 6 results in both larger $C_{u,pev}$ and larger $PT_{total}$, indicating that more PEV agents see their benefits reduced or have their charging requests dropped.

Figure 3.8: Duality gaps of Pareto optimal points in cases 2, 3, and 6 (duality gap in % versus Pareto point index m)
Power analysis is further carried out on the distribution network in the seven cases with a load flow study [60]. The power loss ratio is defined as $\gamma_{ploss} = (P_{sub} - P_{load})/P_{sub}$, where $P_{sub}$ is the real power measured at the substation and $P_{load}$ is the total real power load demand. The average power loss ratios $\gamma_{ploss}$ from 8:00 to 18:00 are calculated for the seven cases and shown in Fig. 3.11. In each case, three charging schemes are studied. The first scheme only considers PEV charging without providing reactive power compensation to the grid. In this scheme, the PEV agent benefits defined in (3.10) are optimized. The other two schemes correspond to the two Pareto points with m = 0 and m = 9. The base $\gamma_{ploss}$ without PEV penetration is 0.00638, shown as the dashed line in Fig. 3.11. Results show that $\gamma_{ploss}$ increases when PEVs are charged without reactive power compensation. However, if reactive power compensation is provided and optimized, up to a 9% reduction of $\gamma_{ploss}$ is achieved by the Pareto point m = 0 in case 4. The exact $\gamma_{ploss}$ improvement is sensitive to the number of PEVs, charging station capacities, and PEV charging demand. Among these factors, the PEV charging demand influences $\gamma_{ploss}$ most. With smaller charging demand, more reactive power can be provided by the PEV chargers during parking. When the PEV number increases while the total station capacity is kept the same, $\gamma_{ploss}$ first decreases and then converges to a steady value, as reflected by cases 1, 6, and 7. This is because most extra PEVs in case 7 cannot be scheduled for charging and parking due to the limited station capacity and therefore do not affect the grid power loss much. When both the PEV number and the total station capacity are scaled up with a similar β, $\gamma_{ploss}$ first decreases and then increases, as shown in cases 1, 2, and 3. This is because the increase of PEV charging load causes more power loss, which affects the overall power loss ratio.

Figure 3.9: Average scheduled PEV unit cost per convenience of Pareto points in 3 study cases (average PEV unit cost versus average reactive power compensation in MVAr; cases 2, 5, and 6)
Figure 3.10: Total PEV drop penalty of Pareto points in 3 study cases (total PEV drop penalty versus total reactive power compensation in MVAr; cases 2, 5, and 6)
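The power loss ratio and a relative comparison between two charging schemes can be computed as in the following Python sketch. The function names and the numeric inputs are purely illustrative assumptions; they are not simulation outputs from this chapter.

```python
def power_loss_ratio(p_sub, p_load):
    """gamma_ploss = (P_sub - P_load) / P_sub, with both powers in the same unit."""
    return (p_sub - p_load) / p_sub


def relative_reduction(reference_ratio, new_ratio):
    """Fractional reduction of the loss ratio with respect to a reference scheme."""
    return (reference_ratio - new_ratio) / reference_ratio


# Hypothetical readings in MW, chosen only to exercise the formulas
example_ratio = power_loss_ratio(p_sub=100.0, p_load=99.0)
example_gain = relative_reduction(reference_ratio=0.02, new_ratio=0.018)
```

Which scheme serves as the reference for a reported reduction has to be fixed before results are compared; the helper above leaves that choice to the caller.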
Figure 3.11: Average power loss ratios of three charging schemes in the 7 test cases (average power loss ratio, ×10^-3, versus case index; schemes: PEV charging only, Pareto point m = 9, Pareto point m = 0)
Chapter 4
On-road PHEV Power Management in
Vehicular Networks
4.1 Background and Motivation
In existing offline PHEV power management systems, deterministic or stochastic optimiza-
tion problems are formulated to obtain optimal power strategies based on historical driving cycles.
These strategies will be used in future driving. Online systems make power decisions according to
real-time driving states. Many PHEV power management systems are based on the powertrain model
with PSD [61, 62, 63, 64]. PHEV fuel efficiency can be optimized through controlling the PSD gear
ratio. For offline PHEV power management, [65] and [66] utilize historical traffic cycles to optimize fuel consumption with dynamic programming (DP) in the temporal and spatial domains, respectively. In [66], the authors use a segment-based road model to reduce computational complexity and obtain a closed-form solution. Several kinds of information, including the slope grade, speed, and acceleration/deceleration, are obtained from historical data for a selected route. However, no stochastic driving cycles or traffic conditions are considered in the models for real driving. In [67], the power management strategy is
represented by a pair of power parameters describing the threshold for ICE and battery power control.
The optimization problem is solved analytically. The solutions are optimal in a statistical sense but
not for an individual trip. [64] proposes a stochastic optimal control approach for PHEV power
management based on Markov decision process (MDP). However, the MDP is modeled with infinite
horizon and can hardly be applied to applications which are sensitive to the trip length, like trip fuel
consumption optimization. In summary, offline power management systems are usually limited by
historical driving cycles and cannot adapt to real driving conditions for optimal performance.
For online power management, systems are designed with various control methods. A
fuzzy controller is developed in [68] to determine the power output split between EM and ICE. Its
proposed baseline control strategy makes ICE work near optimal operation line and optimizes both
fuel consumption and emission. A rule-based supervisor equivalent fuel consumption control system
is proposed in [69] to minimize the equivalent fuel consumption. In [70], a model predictive control
system combined with statically solved power set points is designed to minimize fuel consumption
and emissions. [71] proposes a power-balancing strategy for a parallel PHEV. The ICE operation is
controlled in the peak-efficiency region. This strategy does not rely on a priori trip information and can be easily applied to on-road applications. Inputs to these online systems include the real-time
pedal position, battery SOC, and vehicle speed. Decisions are optimal torque/power splits between
ICE and EM according to real-time driving states. These techniques rely on analysis of PHEV
powertrain models, e.g., ICE and EM efficiency maps, integrated starter generator, and ICE optimal
operation line, without utilizing trip information such as driving routes and cycles. Thus, these
systems lack the overview of entire trips. Their power decisions are optimal for individual driving
states, but not for specific trips.
Few works study the integration of online and offline PHEV power management in the
context of vehicular network, where extra real-time information, i.e., vehicle speed prediction, is
available to be used for optimal on-road power management. Our proposed system leverages such
information in a two-level hierarchical power management scheme.
4.2 System Design
4.2.1 Overview of the System
The scheme of the proposed on-road PHEV power management CPS is shown in Fig. 4.1. Its physical part consists of smartphones and the PHEV powertrain. The smartphone is capable of wireless communication through embedded modules (WiFi, 3G, Bluetooth). It is also equipped with a GPS navigation system, an accelerometer/gyroscope, and high-capacity data storage. It serves as a mobile in-situ vehicle state sensor, a communication device, and a computation unit running power management algorithms. The PHEV powertrain includes the ICE, motors/generators (M/Gs), battery, PSD, etc. The cyber system includes the vehicular network for traffic measurement and vehicle speed prediction, power management algorithms, and real-time/historical driving information. It is assumed that traffic information for urban arterials and freeways is retrieved from the vehicular network by smartphones. The average vehicle speed is then predicted in real time and utilized by the power management system. The GPS navigation system is used to obtain the driving route and location information. The PHEV's driving states, e.g., speed and acceleration, are measured by the smartphone's embedded sensors. This information is combined to generate driving traces, which are stored in smartphones for later modeling.
Figure 4.1: Scheme of on-road PHEV power management system (cyber system: VANET traffic measurement and traffic prediction, historical driving cycles, offline MDP policy, online speed prediction and energy budget, power management algorithm; physical system: smartphone with wireless, navigation, sensors, and storage, and PHEV with ICE, M/G1, and M/G2; links: acquisition, actuation, status update)
The PHEV power management system is designed with a two-level hierarchical optimization, a high-level online optimization and a low-level offline optimization, to reduce the computational complexity for on-road applications. To achieve minimum fuel consumption for a trip, the overall battery energy consumption along the route should first be regulated. This may not be necessary for a short trip, when the battery energy is sufficient to sustain the entire trip. However, for mid- or long-distance driving, the limited battery energy should be well allocated, as an energy budget, to each road according to the varying average fuel efficiency determined by the vehicle driving conditions. When the driving conditions result in low fuel efficiency, more battery energy should be discharged for M/G torque generation rather than driving the vehicle directly with the ICE, and vice versa. These decisions should also be updated in real time to adapt dynamically to changes in the battery state of charge (SOC), the driving route, and the average driving speed. Thus, battery budgets are generated online using real-time vehicle speed prediction information. In contrast, optimal powertrain operation policies, i.e., the ICE speed and torque split ratio under different driving states, do not change during driving because they are determined by the physical characteristics of the powertrain. Therefore, low-level powertrain operation policies are generated offline based on historical driving cycles. By plugging in battery budgets and real-time PHEV driving states, real-time power management decisions are made by looking up the solved policy tables.
4.2.2 System Models
The PHEV power management CPS is based on three important parts: PHEV powertrain
with PSD, unit cycles in spatial domain, and power management algorithms. We next describe them
in detail.
4.2.2.1 PHEV Powertrain with PSD
The PHEV powertrain is configured with a PSD and is used in the low-level power management. The PHEV powertrain model diagram shown in Fig. 4.2 includes powertrain components, power flows, and torque flows. The major powertrain components are an ICE, two M/Gs, a planetary gear, an inverter, and a battery pack. The two M/Gs differ in size. M/G2 has a larger power output and provides traction torque to the car together with the ICE. Its other function is to recharge the battery through regenerative braking. The smaller M/G1 works as a power generator to drive M/G2 or charge the battery. As the PSD, the planetary gear connects the ICE, M/G1, and M/G2 and splits the ICE torque output $T_{ICE}$ into two parts, $T_{ICE,1}$ and $T_{ICE,2}$. $T_{ICE,1}$ is applied to M/G1 to generate electric power $P_{M/G1}$. $T_{ICE,2}$ is applied directly to the final drive shaft to meet the PHEV's torque demand $T_{fd}$ together with the M/G2 torque output $T_{M/G2}$. $P_{M/G1}$ is first provided to M/G2 for torque generation. If the power demand $P_{M/G2}$ of M/G2 for torque generation is larger than $P_{M/G1}$, extra power $P_B$ is drawn from the battery. On the other hand, if $P_{M/G2} < P_{M/G1}$, the remaining part of $P_{M/G1}$ charges the battery.

Figure 4.2: A PHEV model with PSD (components: ICE, M/G1, M/G2, planetary gear, battery, fuel tank, final drive shaft; torque flows and power flows)

The PSD gives the powertrain two degrees of control freedom. The ICE speed $\omega_{ICE}$ and the M/G2 torque generation $T_{M/G2}$ are selected as control variables for the low-level optimization. The following constraints can be derived from the powertrain model with PSD for the low-level optimization formulation:
\[
\begin{aligned}
T_{ICE} &= (1 + \rho)(T_{fd} - T_{M/G2}) \\
T_{ICE,1} &= \rho\,(T_{fd} - T_{M/G2}) \\
\omega_{M/G1} &= \frac{\omega_{ICE}(\rho + 1) - K\omega_{wh}}{\rho} \\
\omega_{M/G2} &= K\omega_{wh}
\end{aligned}
\tag{4.1}
\]
where $\rho = N_s/N_r$. $N_s$ and $N_r$ are the numbers of teeth of the sun gear and ring gear of the PSD, respectively. $\omega_{M/G1}$, $\omega_{M/G2}$, and $\omega_{wh}$ are the speeds of M/G1, M/G2, and the wheel (rad/s), respectively. The wheel speed is determined by the vehicle speed. $T_{fd}$ is the torque required on the final drive shaft for that speed. $K$ is the final drive ratio. Additional constraints include the torque and speed limits of the ICE, M/G1, and M/G2, and the battery charging/discharging power limit. The ICE fuel consumption flow $\dot{fuel}$ (g/s) is included in the objective function as:
\[
\dot{fuel} = \frac{T_{ICE}\,\omega_{ICE}}{\eta_{ICE} H_l} \tag{4.2}
\]
where $H_l$ is the lower heating value of the fuel (J/g) and $\eta_{ICE}$ is the ICE fuel efficiency. $\eta_{ICE}$ has a nonlinear relationship with $T_{ICE}$ and $\omega_{ICE}$ and can be determined by looking up the fuel efficiency map. The battery is approximated as a voltage source with an internal resistance. The change of battery SOC through charging/discharging can be expressed as:
\[
\dot{SOC} = -\frac{I_b}{Q_b} = -\frac{V_{oc} - \sqrt{V_{oc}^2 - 4 R_b P_b}}{2 R_b Q_b} \tag{4.3}
\]
where $I_b$ is the battery discharging/charging current (positive/negative value), $V_{oc}$ is the battery open-circuit voltage, $R_b$ is the battery internal resistance, $P_b$ is the related battery power exchange, and $Q_b$ is the battery charge capacity. The detailed objective function formulation will be discussed in Section 4.3.2.
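Equations (4.1) to (4.3) can be evaluated numerically as in the Python sketch below. It is a minimal illustration under stated assumptions: the function names are hypothetical, $\eta_{ICE}$ is passed in as a number rather than looked up from an efficiency map, and all parameter values in the example call are made up for the arithmetic and do not describe a specific vehicle.

```python
import math


def psd_constraints(t_fd, t_mg2, w_ice, w_wh, rho, k_fd):
    """Evaluate the four PSD relations of (4.1); torques in N*m, speeds in rad/s."""
    t_ice = (1 + rho) * (t_fd - t_mg2)
    t_ice1 = rho * (t_fd - t_mg2)
    w_mg1 = (w_ice * (rho + 1) - k_fd * w_wh) / rho
    w_mg2 = k_fd * w_wh
    return t_ice, t_ice1, w_mg1, w_mg2


def fuel_rate(t_ice, w_ice, eta_ice, h_l):
    """Fuel flow in g/s from (4.2); h_l is the lower heating value in J/g."""
    return t_ice * w_ice / (eta_ice * h_l)


def soc_rate(p_b, v_oc, r_b, q_b):
    """SOC derivative from (4.3); q_b is the charge capacity in A*s."""
    i_b = (v_oc - math.sqrt(v_oc ** 2 - 4 * r_b * p_b)) / (2 * r_b)
    return -i_b / q_b


# Hypothetical operating point
t_ice, t_ice1, w_mg1, w_mg2 = psd_constraints(
    t_fd=100.0, t_mg2=40.0, w_ice=200.0, w_wh=50.0, rho=0.5, k_fd=4.0)
fuel_gps = fuel_rate(t_ice, 200.0, eta_ice=0.3, h_l=42500.0)
soc_dot = soc_rate(p_b=29000.0, v_oc=300.0, r_b=0.1, q_b=90000.0)
```

With these inputs the battery current works out to 100 A, which is consistent with $P_b = V_{oc} I_b - R_b I_b^2$, the relation that (4.3) inverts.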
4.2.2.2 Unit Cycles in Spatial Domain
In most existing PHEV power management systems, power decisions are generated in
the temporal domain (e.g., power splits from ICE and M/G at different time slots in a trip). It is
convenient for the modeling and performance analysis. However, it is hard to apply them for on-road
power management since travel time on each road is highly varying. First, the vehicle speed is
dynamic and affected by many elements in stochastic ways. Second, at each signal intersection, it
is difficult to retrieve or estimate the waiting time. Power management decisions in the temporal
domain can hardly match the real-time driving states. In contrast, as long as the driving route is given, the geographical topology of the traveled roads and the total driving distance do not change. Therefore, PHEV power management in this paper is formulated in the spatial domain. Historical
driving cycles in the time domain are converted to the spatial domain for modeling. The speed is
recalculated as the average value in corresponding time slots.
A driving cycle consists of roads of different lengths. For modeling convenience, the whole driving cycle is decomposed into unit cycles, each of which represents an urban road segment (arterial or local road) between two intersections or a segment of freeway, as shown in Fig. 4.3. Because urban roads usually have different lengths, five typical lengths, from 0.1 to 0.5 mile with corresponding length indices from 1 to 5, are used to represent urban unit cycles. Unit cycles shorter than 0.1 mile or longer than 0.5 mile are modeled with index 1 or 5, respectively. Affected by intersections and traffic signals, driving speed characteristics on urban roads usually vary with location. To model driving cycles more accurately, an urban unit cycle is divided into three sections: A1 for departure, A2 for cruising, and A3 for arrival. A1 is the distance from the stop line of the upstream intersection to the location where cruising speed is usually achieved. Speeds in A1 have a large probability of transitioning from low to high values. In contrast, A3 represents the deceleration section where PHEVs approach the stop line of the downstream intersection. Lengths of A1, A2, and A3 are chosen according to historical driving cycles. Because freeways are not separated naturally by intersections, the driving distance on a freeway is decomposed into N unit cycles, each of which has a fixed length of 0.2 mile. A freeway unit cycle is one of three types: entering (acceleration, $U_1$), cruising (from $U_2$ to $U_{N-1}$), and exiting (deceleration, $U_N$). For an urban or freeway unit cycle, each 0.1 mile of length is further discretized into 20 slots.
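The indexing rules for unit cycles can be sketched in Python as follows. The text fixes the clamping to indices 1 and 5, the 0.2-mile freeway unit length, and the 20 slots per 0.1 mile; mapping intermediate urban lengths to the nearest 0.1 mile and rounding a partial freeway unit up are assumptions of this sketch, as are the function names.

```python
import math

URBAN_STEP_MILE = 0.1      # spacing of the five typical urban lengths
FREEWAY_UNIT_MILE = 0.2    # fixed length of a freeway unit cycle
SLOTS_PER_STEP = 20        # slots per 0.1 mile


def urban_length_index(length_mile):
    """Map an urban segment length to a length index in 1..5 (nearest 0.1 mile)."""
    return min(5, max(1, round(length_mile / URBAN_STEP_MILE)))


def urban_slot_count(length_mile):
    """Number of spatial slots used to model the urban unit cycle."""
    return SLOTS_PER_STEP * urban_length_index(length_mile)


def freeway_unit_types(distance_mile):
    """Label the N freeway unit cycles as entering, cruising, or exiting."""
    # Round first so that floating-point noise does not add a spurious unit
    n = max(2, math.ceil(round(distance_mile / FREEWAY_UNIT_MILE, 9)))
    return ["entering"] + ["cruising"] * (n - 2) + ["exiting"]


short_idx = urban_length_index(0.05)   # clamped to 1
long_idx = urban_length_index(0.8)     # clamped to 5
mid_slots = urban_slot_count(0.3)      # index 3, 20 slots per 0.1 mile
labels = freeway_unit_types(1.0)       # five unit cycles of 0.2 mile
```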
Figure 4.3: Unit cycle models for urban roads and freeway. (a) Urban unit cycle: speed versus distance over the departure (A1), cruising (A2), and arrival (A3) sections. (b) Freeway unit cycle: speed versus distance over the entering (U1), cruising (U2 to UN−1), and exiting (UN) unit cycles.
4.3 Hierarchical Power Management Algorithms and Solutions
4.3.1 Hierarchical Power Management Algorithms
The scheme of the proposed PHEV power management system is shown in Fig. 4.4. The
objective is to minimize total fuel consumption of the entire driving trip. The high-level management
allocates battery energy budgets to unit cycles according to real-time vehicle speed prediction.
These decisions are made online with updates when new prediction information becomes available.
In contrast, the low-level power management strategies output the optimal $T_{M/G2}$ and $\omega_{ICE}$ according to real-time driving states. Solving the low-level problem is computationally difficult on account of the nonlinear powertrain models and the large set of historical driving cycles. Thus, low-level strategies are generated offline. At both levels, strategies are made and executed through five layers.

Figure 4.4: PHEV hierarchical mode for PHEV power management (layers: data, model, optimization, strategy, decision; high-level online path: real-time vehicle speed prediction, quadratic model, stochastic quadratic programming, battery energy budget; low-level offline path: historical driving cycles, unit cycles with spatial index, PHEV powertrain model, Markov decision process, MDP policy; decision from the PHEV real-time state)
The first layer is the data input layer. In the online mode, real-time traffic speed prediction
from vehicular network is used for battery energy budget generation. A driver’s future average speed
on a road is approximated as the same as the road’s average traffic speed prediction. It is assumed
that next 30-minute traffic speed predictions are available for freeways and urban arterials in the form
of expectations and distribution probabilities, which is reasonable according to recent research results
[72, 73]. For other roads without prediction, their historical average speeds and speed transition
probabilities are used. In the low-level offline mode, historical driving cycles are input data for
the PHEV powertrain modeling. From driving cycles, torque demand is calculated by considering
friction force, aerodynamic force, and acceleration force [74].
Input data are then applied to models in the second layer. The online mode uses a quadratic model to simplify the nonlinear relationship between the optimal achievable average fuel rate $\dot{fuel}^*$ and the battery budget power $P_e$ (budget energy divided by travel time in the unit cycle) for a unit cycle. It is formulated as $\dot{fuel}^* = a_2 P_e^2 + a_1 P_e + a_0$, where $a_2$, $a_1$, and $a_0$ are coefficients to be determined through curve fitting. The quadratic model is a trade-off between accuracy and computational complexity [74]. Samples for quadratic model fitting, i.e., pairs of $\dot{fuel}^*$ and $P_e$, are solved offline from historical driving cycles by using DP. Quadratic models are then fit separately for different types of unit cycles, i.e., with different length indices and average speeds, from the DP solutions in a least-squares sense. The low-level offline models include the PHEV powertrain model and the spatial unit cycle model.
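A least-squares fit of the quadratic fuel-rate model can be sketched in plain Python as below. The normal-equation solver, the function name `fit_quadratic`, and the sample points are assumptions for illustration; in this work the samples come from DP solutions, which are not reproduced here.

```python
def fit_quadratic(pe, fuel):
    """Least-squares fit of fuel ~ a2*pe**2 + a1*pe + a0 via the normal equations."""
    # Power sums S_k = sum(pe**k) and moments B_k = sum(fuel * pe**k)
    s = [sum(x ** k for x in pe) for k in range(5)]
    b = [sum(y * x ** k for x, y in zip(pe, fuel)) for k in range(3)]
    # Augmented normal equations for the unknowns (a2, a1, a0)
    m = [
        [s[4], s[3], s[2], b[2]],
        [s[3], s[2], s[1], b[1]],
        [s[2], s[1], s[0], b[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * c for a, c in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


# Synthetic samples generated from 2*Pe^2 - 3*Pe + 1, so the fit is exact
a2, a1, a0 = fit_quadratic([0, 1, 2, 3], [1, 0, 3, 10])
```

One such fit would be stored per unit-cycle type, so the online stage only has to evaluate a three-coefficient polynomial.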
With both data and models, the power optimizations are formulated in the third layer. The high-level problem is formulated as a multi-stage stochastic quadratic programming (MSQP) problem that generates battery energy budgets for the unit cycles in a trip. With the energy budgets as constraints, the low-level problem is formulated as a finite-horizon MDP, and the MDP policies for $T_{M/G2}$ and $\omega_{ICE}$ are generated offline. Finally, during on-road driving, $\omega_{ICE}$ and $T_{M/G2}$ decisions are looked up from the MDP policy tables according to the real-time generated battery budgets, driving states, and road length indices.
4.3.2 Optimization Formulation and Solutions
With the two-level power management system, both the high-level online and the low-
level offline power managements are formulated as optimization problems and solved by efficient
algorithms.
4.3.2.1 Online Stochastic Quadratic Programming for Battery Budget Generation
In the high-level online power optimization, a stage is defined as a unit cycle. Future
traffic speeds are random variables. We can obtain their prediction but not full information until
their realizations, i.e., a PHEV is entering the next unit cycle and its traffic speed is measured
instantaneously. Thus, the optimization should be formulated with probabilistic descriptions of traffic
speeds, e.g., probability distributions and densities, to incorporate their effects on optimal budget
decisions. As a PHEV is driving, the high-level power management is done through sequential
decisions in multiple stages from the start to the end of a trip. The diagram of the stochastic
optimization is shown in Fig. 4.5. When a new traffic speed is disclosed, it is desired to use its
realization to generate energy budgets for future unit cycles. The reaction to the realization of random
variables for future decisions is called recourse. Thus, the high-level power management problem
can be described as: at the end of stage k, given the current battery SOC, the traffic speed measurement of stage k + 1, and the vehicle speed prediction of future stages in the trip, decide the battery energy budget for stage k + 1 in order to minimize the PHEV's fuel consumption in stage k + 1 plus the expected total fuel consumption in the remaining stages. The generated battery energy budget for stage k + 1 is then applied to the low-level power management as a constraint. We observe that the problem has the following features: 1) the problem has a convex, polyhedral constraint space; 2) the cost function is quadratic and continuous; 3) traffic speed prediction is assumed to be obtained from a vehicular network. Thus, it is appropriate to model this optimization problem as an MSQP with recourse. Existing mathematical programming techniques for MSQP can solve the problem quickly and to global optimality.
Figure 4.5: Diagram of stochastic programming for online PHEV power management (blocks: stage k, traffic prediction as a stochastic process, battery capacity constraint, stochastic optimization, optimal solutions, energy budgets, stage update k ← k + 1)
The distribution of the stochastic vehicle speed, obtained from vehicle speed prediction, is represented by a finite number of scenarios. To solve the problem online, the choice of the number of scenarios has to consider the trade-off between solution quality and computational complexity. Short-term speed prediction, i.e., within the next 10 minutes, usually has a small root mean square error (RMSE), so the number of scenarios in such a stage is set to one by using only the predicted expectation. For the long-term prediction between the next 10 and 30 minutes, three scenarios are used to represent the two-sigma range of its probability. For the MSQP with stage k as the first stage, the control variable is the battery energy budget $EB_k$. The stochastic average speed is represented as a random variable $V_k$.
The MSQP is formulated as:
$$\min\; z_k(\mathbf{EB}_k, \mathbf{v}_k) = T_k\, g_{v_k, i_l, k}(EB_k/T_k) + E_{V_{k+1}\mid \mathbf{v}_k}\big[Q_k(\mathbf{EB}_k, V_{k+1})\big] \tag{4.4}$$
where
$$\mathbf{EB}_k = (EB_1, EB_2, \ldots, EB_k), \qquad \mathbf{v}_k = (v_1, v_2, \ldots, v_k) \tag{4.5}$$
$\mathbf{EB}_k$ and $\mathbf{v}_k$ denote the sequences of budget decisions and of realizations of the random variables $V$ up to
stage $k$, respectively. $g_{v_k, i_l, k}(EB_k/T_k)$ is the optimal achievable fuel consumption rate in stage $k$
with length index $i_l$, traffic speed $v_k$, and battery power budget $EB_k/T_k$. It is calculated from the
quadratic fuel consumption rate models. $T_k = d_k/v_k$ is the travel time in stage $k$ and $d_k$ is the length
of the unit cycle. The recourse function is $Q_k(\mathbf{EB}_k, V_{k+1}) = \min z_{k+1}(\mathbf{EB}_{k+1}, \mathbf{v}_{k+1})$, and its conditional expectation given
$\mathbf{V}_k = \mathbf{v}_k$ is denoted as $E_{V_{k+1}\mid \mathbf{v}_k}[Q_k(\mathbf{EB}_k, V_{k+1})]$. The constraints require that, after allocating budget
$EB_k$ to stage $k$, the remaining battery SOC stays above the minimum value.
The MSQP described in (4.4) is solved to global optimality by the quadratic nested decomposition algorithm
[75, 76]. This algorithm evolves from a Newton-type method for solving piecewise quadratic
programming. Three assumptions must be satisfied to solve an MSQP with this algorithm: 1) the
number of scenarios in each stage is finite; 2) the control variables lie in polyhedral convex sets; 3) the
quadratic term of the cost function in each stage is positive semidefinite for all scenarios. When
these assumptions hold, the algorithm terminates in a finite number of iterations, either obtaining
a global optimal solution or detecting an unbounded problem. Our MSQP formulation (4.4) satisfies all
these requirements and can be solved with the algorithm.
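To make the recourse structure of (4.4) concrete, the following is a minimal two-stage sketch in Python. It is not the quadratic nested decomposition algorithm of [75, 76]: it simply enumerates a grid of first-stage budgets and, for each speed scenario, picks the best second-stage budget from the remaining energy. The quadratic fuel-rate coefficients, the function names, and the grid resolution are hypothetical and chosen only for illustration.

```python
import numpy as np

def stage_cost(budget, speed, distance, coeffs):
    """Fuel used in one stage, T_k * g(EB_k / T_k), with T_k = d_k / v_k.

    g(P) = a*P^2 + b*P + c is a hypothetical quadratic fuel-rate model.
    """
    a, b, c = coeffs
    travel_time = distance / speed      # hours if distance is in miles, speed in MPH
    power = budget / travel_time        # battery power budget EB_k / T_k
    return travel_time * (a * power**2 + b * power + c)

def two_stage_budget(energy_avail, v1, scenarios, distance, coeffs, n_grid=201):
    """Grid search for the first-stage budget with expected recourse cost.

    scenarios is a list of (speed, probability) pairs for the second stage.
    Returns the best first-stage budget and the associated expected total fuel.
    """
    grid = np.linspace(0.0, energy_avail, n_grid)
    best_budget, best_cost = 0.0, np.inf
    for eb1 in grid:
        # Second-stage budgets that keep total usage within the available energy.
        feasible = grid[grid <= energy_avail - eb1 + 1e-12]
        recourse = sum(
            prob * stage_cost(feasible, v2, distance, coeffs).min()
            for v2, prob in scenarios
        )
        total = stage_cost(eb1, v1, distance, coeffs) + recourse
        if total < best_cost:
            best_budget, best_cost = float(eb1), float(total)
    return best_budget, best_cost
```

Brute-force enumeration is only practical for toy problems with a few stages; it is used here because it makes the "decide now, react to each scenario later" structure explicit.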
4.3.2.2 Offline PHEV Power Policy Generation
In the low-level offline power management, PHEV power policies are solved for unit cycles
by using historical driving data. PHEV power policies map driving stages to optimal ωICE and
TM/G2 decisions. Optimal ωICE and TM/G2 decisions are sensitive to the lengths of unit cycles,
vehicle average speed, and battery budget allocation. Thus, policies are differentiated for unit cycles
with different length indices (from 1 to 5 for urban unit cycles) and speed level (low or high, with
average speed 25 MPH and 50 MPH as thresholds for urban and freeway unit cycles, respectively).
Because a PHEV’s driving speed profiles in a unit cycle are different in different trips, the power
policies minimize its expected fuel consumption in the unit cycle.
The driving speed and torque demand of a PHEV in one time slot are mainly determined
by the driver’s pedal/throttle command and the vehicle speed in the previous time slot. Thus, both the
speed and torque requirement are modeled as Markov chains. The problem of solving the optimal
low-level power policies for unit cycles is modeled as a finite-horizon MDP. The MDP can be
described as follows: given the battery energy budget constraint and vehicle speed level, find the optimal
M/G2 torque and ICE speed policy that minimizes the total expected fuel consumption in the unit cycle.
ICE on/off state optimization is not included in the MDP for simplicity. Instead, it is controlled
by rule-based strategies, i.e., the ICE is turned off after a period of waiting. An MDP stage is a
road slot reflecting its spatial granularity. An MDP state $x_k$ in stage $k$ includes the PHEV’s speed $v_k$,
torque demand on the wheel $T_{w,k}$, and the remaining energy budget percentage $q_k$. An action $a_k$
incorporates M/G2’s torque output $T_{M/G2,k}$ and the ICE speed $\omega_{ICE,k}$. State transition probabilities
$p(x, x') = \Pr(x_{k+1} = x' \mid x_k = x)$ are differentiated for low and high speed levels and learned
using the maximum likelihood estimation method [77]. The reward $r_k(x_k, a_k)$ for action $a_k$ in stage
$k$ is designed as the negative fuel consumption. The MDP maximizes the total expected reward under the
policy $\pi$:
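$$E_\pi\left\{ \sum_{k=1}^{N} r_k(x, a) \right\} \tag{4.6}$$
where
$$\begin{aligned} x_k &= (v_k, T_{w,k}, q_k) \\ a_k &= (T_{M/G2,k}, \omega_{ICE,k}) \\ r_k(x_k, a_k) &= \frac{\omega_{ICE,k} L_s}{\eta_{ICE} H_l v_k}\,(1 + \rho)(T_{fd,k} - T_{M/G2,k}) \end{aligned} \tag{4.7}$$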
Ls is the length of a stage. The finite-horizon MDP is usually solved with the backward induction
method [78]. In the last stage N , the reward rN (x, a) is maximized. From stage N − 1 to the first
stage, in each stage k, the value function Vk(x) is computed with optimal action a as:
$$V_k(x) = \max_{a \in A}\Big\{ r_k(x, a) + \sum_{x' \in X} p(x, x')\, V_{k+1}(x') \Big\} \tag{4.8}$$
where $X$ and $A$ are the state and action spaces, respectively. $\pi_{k,i_l,j,EB}(x)$ is the power management
policy generated for stage $k$ in the unit cycle with length index $i_l$, speed level $j$ (0 or 1 for high or
low level), and battery budget $EB$. According to the vehicle’s driving state $x_k$, power decisions are
looked up from policy tables as $a_k = \pi_{k,i_l,j,EB}(x_k)$.
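The backward induction recursion in (4.8) can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation used in this work: states and actions are assumed to be discretized into integer indices, stages are 0-based, and the array layout is hypothetical. Equation (4.8) writes the transition probability as p(x, x') without an action index; the sketch allows one transition matrix per action, which reduces to (4.8) when all matrices are identical.

```python
import numpy as np

def backward_induction(rewards, transitions):
    """Solve a finite-horizon MDP by backward induction.

    rewards:     array of shape (N, X, A), rewards[k, x, a] = r_k(x, a)
    transitions: array of shape (A, X, X), transitions[a, x, y] = Pr(x' = y | x, a)
    Returns (values, policy) with values of shape (N + 1, X) and policy of
    shape (N, X); values[N] is the terminal value, assumed to be zero.
    """
    n_stages, n_states, _ = rewards.shape
    values = np.zeros((n_stages + 1, n_states))
    policy = np.zeros((n_stages, n_states), dtype=int)
    for k in range(n_stages - 1, -1, -1):
        # q[x, a] = r_k(x, a) + sum over x' of p(x, x' | a) * V_{k+1}(x')
        q = rewards[k] + np.einsum("axy,y->xa", transitions, values[k + 1])
        policy[k] = q.argmax(axis=1)
        values[k] = q.max(axis=1)
    return values, policy
```

Since the reward is the negative fuel consumption, the expected fuel consumption from the first stage in state x is `-values[0, x]`, and `policy[k, x]` plays the role of the lookup table described above.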
Table 4.1: Configuration of PHEV Powertrain with PSD
Component   Parameter        Value
Vehicle     Total Mass       1486 kg
ICE         Maximum Power    43 kW @ 4000 rpm
            Peak Torque      101.7 N*m @ 4000 rpm
            Idling Speed     1200 rpm
M/G1        Maximum Power    15 kW
            Peak Torque      55 N*m @ -2500 rpm∼2500 rpm
            Max Speed        ±6000 rpm
M/G2        Maximum Power    30 kW
            Peak Torque      305.0 N*m @ 0∼940 rpm
            Max Speed        6000 rpm
Battery     Capacity         6 Ah × 308 V
4.4 Results and Analysis
The simulation platform is built with Java, Matlab, and the ADVISOR simulator [79]: the power
management algorithms are implemented in Java and Matlab, and ADVISOR, a widely used
vehicle simulation tool, models the vehicle. The Toyota Prius powertrain parameters are obtained from
ADVISOR and shown in Table 4.1. The proposed power management method is built and validated
on eight standard driving cycles in ADVISOR. Seven of them, including HWFET, INRETS, LA92,
NYCC, SC03, SC06, and UNIF01, are used for learning the speed transition probabilities and fitting
the quadratic models. The remaining UDDS cycle is used for the power management performance
evaluation. To simulate the scenario of on-road driving, stochastic driving cycles are generated
based on UDDS and speed transition probabilities, and adjusted according to future vehicle speed
prediction. The vehicle speed prediction is given in the form of a speed probability density function
(pdf). The pdf is assumed to be a normal distribution [80, 81], where
the prediction result and RMSE are used as estimates of the distribution mean and standard
deviation, respectively. The vehicle speed scenario probabilities required in the high-level MSQP are
calculated from the vehicle speed prediction models.
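As a hedged illustration of how scenario probabilities might be derived from a normal speed pdf, the sketch below splits the two-sigma range around the predicted speed into equal-width bins, uses each bin center as a scenario speed, and assigns it the normalized probability mass of the bin. The binning rule, the function names, and the default arguments are assumptions made for this example; the exact discretization used in this work is not specified here.

```python
import math

def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def speed_scenarios(pred_speed, rmse, n_scenarios=3, n_sigma=2.0):
    """Discretize N(pred_speed, rmse^2) into scenarios over the n_sigma range.

    Returns a list of (speed, probability) pairs whose probabilities sum to one.
    """
    low = pred_speed - n_sigma * rmse
    width = 2.0 * n_sigma * rmse / n_scenarios
    scenarios = []
    for s in range(n_scenarios):
        left = low + s * width
        right = left + width
        mass = normal_cdf(right, pred_speed, rmse) - normal_cdf(left, pred_speed, rmse)
        scenarios.append((0.5 * (left + right), mass))
    total = sum(mass for _, mass in scenarios)
    # Renormalize so the truncated tails outside the range are not lost.
    return [(speed, mass / total) for speed, mass in scenarios]
```

With `n_scenarios=1` the function returns the predicted expectation with probability one, which matches the single-scenario treatment of short-term stages. The sketch does not clip negative speeds, so it should only be used where the predicted speed is well above two RMSE.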
For system evaluation, we analyze system modeling results, the performance of power decisions
in a randomly selected driving cycle, and fuel consumption in different test cases. First, the
low-level models learned from the seven historical driving cycles are checked. Speed transition
probabilities of urban road driving in different sections are shown in Fig. 4.7. Each grid cell represents
Figure 4.6: UDDS driving cycle and a sample of generated stochastic driving cycle in spatial domain. Panels: (a) UDDS; (b) Stochastic driving cycle. Axes: Driving Distance (Mile) vs. Speed (MPH).
a speed transition instance from the current stage to the next stage in the spatial domain. Speed transition
characteristics differ among the three sections. In the departure and arrival sections, most transition
instances lie above and below the diagonal, respectively. In the cruising section, speeds remain
stable with high probability and transition instances lie around the diagonal. The UDDS driving
cycle and a sample stochastic driving cycle in the spatial domain are shown in Fig. 4.6a and
Fig. 4.6b, respectively. Fig. 4.8 shows examples of M/G2 torque and ICE speed policies generated
from the MDP. $V_{veh}$ and $T_{fd}$ indicate the vehicle’s speed and torque requirement on the final drive shaft,
respectively. $\alpha_{M/G2}$ is the ratio of M/G2 torque output to $T_{fd}$. $\omega_{ICE}$ is the ICE speed normalized to
the maximum ICE speed. Due to the limited number of driving cycles for training, some states are
not covered, and their $\alpha_{M/G2}$ and $\omega_{ICE}$ entries are marked with negative values. For these states,
policies are approximated from their adjacent neighbors. If policies of their neighbors are also
Figure 4.7: Speed transition probabilities for urban roads with light traffic. Panels: (a) Transition probabilities for vehicle departure; (b) Transition probabilities for vehicle cruising; (c) Transition probabilities for vehicle arrival.
unavailable, the charge-depleting/charge-sustaining (CDCS) strategy is applied. In the CDCS
strategy, M/G2 generates torque to meet the PHEV demand while the battery SOC is sufficient
(charge depleting). Once the battery SOC decreases to a low level, both ICE and M/G2 provide
torque and the battery SOC is maintained within a preferred range (charge sustaining). Taking
the M/G2 torque policy shown in Fig. 4.8a as an example, $\alpha_{M/G2}$ is large in regions with low torque
demand and high driving speed because of the high M/G2 efficiency. When the driving speed is low,
a larger part of the torque demand is generated by the ICE. Because M/G2’s rotary speed is the same as
the driving speed and its efficiency decreases as its rotary speed decreases, battery energy is saved
for future use when M/G2’s efficiency is high. Similarly, the ICE speed policy in Fig. 4.8b gives
$\omega_{ICE}$ decisions for the maximum fuel efficiency.
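The handling of uncovered states described above can be sketched as follows. The sketch assumes that a policy map is stored as a two-dimensional array over (speed, torque) bins, that uncovered cells carry negative values, and that "adjacent neighbors" means the four directly adjacent cells whose covered values are averaged. These choices, and the function name, are assumptions for illustration rather than the rule used in this work.

```python
import numpy as np

def fill_uncovered(policy_map):
    """Fill uncovered policy cells from covered 4-neighbors.

    policy_map: 2-D array; entries < 0 mark states not covered by training data.
    Returns (filled, needs_fallback): filled holds the neighbor average where
    at least one covered neighbor exists; needs_fallback flags cells that
    should fall back to the CDCS rule at run time.
    """
    filled = policy_map.astype(float).copy()
    needs_fallback = np.zeros(policy_map.shape, dtype=bool)
    rows, cols = policy_map.shape
    for r in range(rows):
        for c in range(cols):
            if policy_map[r, c] >= 0:
                continue  # state is covered, keep the learned policy
            neighbors = [
                policy_map[rr, cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols and policy_map[rr, cc] >= 0
            ]
            if neighbors:
                filled[r, c] = float(np.mean(neighbors))
            else:
                needs_fallback[r, c] = True
    return filled, needs_fallback
```

Only originally covered cells are used as neighbors, so filled values do not propagate further into uncovered regions; those regions remain flagged for the CDCS fallback.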
For the low-level power management evaluation, the expected fuel consumption in a unit
cycle with the MDP policy, as −V0(x) defined in (4.8), is compared with that with CDCS strategy.
Results are shown in Fig. 4.9. When the battery energy budget is small, MDP policy is more
fuel-efficient than CDCS. As the battery energy budget becomes large enough to sustain M/G2’s
torque generation in the whole unit cycle, CDCS has the same performance as the MDP policy.
Vehicle speed prediction accuracy and available battery energy for discharging are two
important factors affecting power management decisions in the proposed system. To evaluate the performance of the proposed
two-level power management system, five cases are tested with different configurations of vehicle
speed prediction and initial battery SOC. Three cases have the same initial battery SOC of 0.7 but
different vehicle speed prediction errors, with RMSE of 3 MPH, 5 MPH, and 8 MPH, respectively. The
other two cases assume a vehicle speed prediction RMSE of 5 MPH and start with an
initial battery SOC of 0.9 and 0.5, respectively. Ten stochastic driving cycles are tested in each case. The
performance of our proposed method, denoted as MSQP/MDP, is compared with four other
methods. The second method only utilizes the vehicle speed prediction expectation in the high-level
problem without considering its distribution information. In this way, the high-level optimization
is simplified as a quadratic programming (QP). This method is denoted as the QP/MDP. In the
third method, the battery budget generation is not optimized. Instead, the low-level MDP model
is provided with battery budget as large as possible according to the battery SOC. This method is
called MDP Only. The fourth method is the CDCS. The last method is denoted as Static, which
solves power management decisions offline based on the UDDS cycle with DP and then applies
decisions to testing driving cycles. Even though UDDS and the testing cycles have the same route,
static decisions cannot always be applied to the testing cycles because their driving states may be
different. For example, at the same location in UDDS and a testing cycle, the PHEV may decelerate
without torque output in the former but require torque generation for acceleration in the latter. In
these situations, the CDCS method is applied instead.
The high-level online MSQP in the proposed MSQP/MDP method can be solved within 3
seconds. A test sample is selected randomly and its torque generation and SOC profile are
studied in detail. The torque requirement on the final drive shaft is shown in Fig. 4.10. The torque
outputs of ICE and M/G2 and the fuel consumption along the driving distance with MSQP/MDP and
CDCS are shown in Fig. 4.11. With the CDCS strategy, less fuel is consumed at the beginning of
the driving cycle and more torque is generated from M/G2. However, the battery is depleted quickly
Figure 4.8: MDP policy maps for an urban unit cycle with length index il = 2, stage index k = 15, light traffic, and remaining battery budget 0.015 kWh. Panels: (a) M/G2 torque policy map; (b) ICE speed policy map.
and fuel must be consumed for torque generation in the remaining driving cycle, even though the
fuel efficiency is low. The total fuel consumption of CDCS is larger than that of MSQP/MDP. In
our proposed method, torque outputs of ICE and M/G2 are well balanced to minimize the total fuel
Figure 4.9: Expected fuel consumption of an urban unit cycle (length index il = 2) with MDP and CDCS. Axes: Battery Energy Budget (kWh) vs. Expected Total Fuel Consumption (gram). Curves: MDP, CDCS.
consumption in the trip. The battery SOC profiles in the test sample with the five methods are shown
in Fig. 4.12. Unlike the CDCS method, the proposed MSQP/MDP schedules battery energy
well along the driving cycle.
We further check the ICE operation points on the ICE efficiency map, which is shown
in Fig. 4.13. Different fuel efficiency levels, e.g., from 0.15 to 0.4, are shown as contours in Fig.
4.13. The ICE optimal operation line is defined as the set of operation points that consume the
least fuel for a given constant power output [82]. As shown in Fig. 4.13, operation points of the
proposed MSQP/MDP are close to the optimal operation line, which means high fuel efficiencies
are achieved with less fuel consumption. In contrast, operation points in CDCS are widely
distributed over the ICE fuel efficiency map. This is because the battery is depleted in early stages with
the CDCS method and the ICE has to consume fuel and work in regions with low fuel efficiency in
order to generate enough torque.
Average fuel consumption is further compared among the five methods in the five
cases and results are shown in Fig. 4.14. In each case, the average fuel consumption of 10 tests
is calculated and normalized to that of MSQP/MDP. The proposed MSQP/MDP outperforms
the other methods in all cases in terms of fuel consumption, while the CDCS method has the worst
performance. Fig. 4.14a shows that the difference in fuel consumption between MSQP/MDP and
QP/MDP becomes larger as the prediction RMSE increases. This is because when the vehicle speed
prediction is inaccurate, with larger RMSE, using only the prediction expectation in the high-level budget
generation is not enough to find optimal decisions. On the other hand, with less accurate prediction,
i.e., RMSE = 8 MPH, the fuel consumption difference between MSQP/MDP and the Static method is
Figure 4.10: Required torque on the final drive shaft. Axes: Driving Distance (Mile) vs. Torque (N*m).
small. This is because inaccurate prediction in high-level management cannot
generate optimal energy budgets and does not improve the system performance significantly. Fig. 4.14b
shows that the differences in fuel consumption among MSQP/MDP, Static, and CDCS become
smaller as the initial battery SOC increases. This indicates that the management does not contribute
much to fuel reduction when battery energy is sufficient and CDCS is near optimal. As the initial
SOC increases, the performance of QP/MDP degrades gradually and is surpassed by MDP
Only. This shows that high-level decision sub-optimality can be amplified by a large amount of
available battery energy. The QP/MDP method fails to schedule battery budgets optimally and leaves
much battery energy unused at the end of the trip.
Figure 4.11: ICE and EM torque output and fuel consumption along distance in MSQP/MDP and CDCS. Panels: (a) ICE torque output (MSQP/MDP); (b) ICE torque output (CDCS); (c) EM torque output (MSQP/MDP); (d) EM torque output (CDCS); (e) Fuel consumption (MSQP/MDP); (f) Fuel consumption (CDCS). Horizontal axis: Driving Distance (Mile).
Figure 4.12: Battery SOC along driving distance in a sample test driving cycle. Axes: Driving Distance (Mile) vs. SOC. Curves: 1. MSQP/MDP; 2. QP/MDP; 3. Static; 4. CDCS; 5. MDP Only.
Figure 4.13: Operation points on ICE efficiency map. Legend: OPL; × 2-level MSQP/MDP OP; ○ CDCS OP.
Figure 4.14: Fuel consumption comparison. Panels: (a) Fuel consumption comparison with initial SOC 0.7 and different vehicle speed prediction RMSE; (b) Fuel consumption comparison with vehicle speed prediction RMSE 5 MPH and different initial battery SOC.
Chapter 5
Traffic and Vehicle Speed Prediction in
Vehicular Networks
5.1 Background and Motivation
Traffic predictions include the prediction of traffic flow, average traffic speed of selected
roads, and average travel time on selected routes. Predictions are more important for freeways
and arterials, which have large flow and speed variation within a day or at the same time on different
days. Traffic prediction methods can be categorized into two types: model-based prediction and
data-driven prediction. Model-based prediction uses traffic models, such as vehicular density, vehicular
flow and individual vehicle trajectories [83, 84, 85, 86, 87], to describe future traffic conditions.
These methods require complex computation and extensive on-site calibration and are difficult to
implement. On the other hand, data-driven methods rely heavily on the input traffic data and try to find
the relationship between future and historical data. They are easy to implement and can
adapt to changing traffic conditions. Existing data-driven traffic prediction models are based on
historical average, time series analysis, NN, and nonparametric regression.
In [88], the authors compare the above four models in freeway traffic flow prediction.
Historical average is the simplest model for implementation but has the largest average absolute
error. The autoregressive integrated moving average (ARIMA) model, as a widely used time series
model for prediction, can be easily implemented based on existing techniques. But it is hard to
handle missing values and is just slightly better than the historical average model. NN has the
second best results and is suitable for nonlinear relationship prediction, which, however, requires a
complex training process. The nonparametric regression with nearest neighbor formulation has the
best performance, with the smallest average absolute error and a well-distributed error. However, it relies heavily
on the quality of the database used, and identifying neighbors is also complex. Work [89] combines
ARIMA with Generalized Autoregressive Conditional Heteroscedasticity (ARIMA-GARCH) to deal
with non-constant conditional variance in one-step (5-15 minute) freeway traffic flow prediction.
Results show that ARIMA-GARCH provides additional information, i.e., the time-variant confidence
interval, and reaches prediction accuracy similar to ARIMA. Work in [90] proposes NN with road
clustering for traffic speed prediction. The NN training time is reduced by utilizing the correlation
in clusters. Results show that the proposed method gives more accurate prediction than time-series
methods and the binary NN method proposed in [91]. Different prediction intervals (from 5 minutes
to 30 minutes) and two traffic conditions (congestion and non-congestion) are explored in [92] with
NN. For individual vehicle speed prediction, work in [93] uses a constant percentile to predict the
vehicle speed, which assumes drivers have preferred speeds at different locations that stay
the same for each drive. This method suffers from low accuracy since the influence of traffic conditions
on vehicle speed is not considered. Work in [94] proposes a vehicle cruising speed prediction method
based on non-parametric kernel density estimation (KDE) and parameterized launching models. The
prediction system is designed with low complexity, but it can only predict a vehicle’s speed 20 seconds
ahead and is limited to specific road types.
5.2 System Description
It is assumed that all vehicles studied in this system are connected through on-board
smartphone communications in a vehicular network. Their driving data are measured in real time
by smartphone embedded sensors and stored in memory cards. Driving data are uploaded to the
cloud regularly and aggregated there to calculate the real-time traffic speed and flow information.
On account of the significant effect of traffic condition on vehicle speed, the accuracy of vehicle
speed prediction can be greatly improved by utilizing future traffic conditions. Thus, the vehicle
speed prediction system is designed as a two-level scheme shown in Fig. 5.1. The first-level system
predicts the traffic speed down the road with NN models remotely in cloud servers. Road segments
are targeted for prediction according to vehicles’ driving routes. NNs are effective at representing complex
nonlinear relationships between different statistics, e.g., the traffic speed relationship between one
target road segment and its neighbors.
Besides traffic speed, individual vehicle speeds can also be affected by other factors,
including vehicle type, road type, and lane selection/change. Some driving states are unobservable,
i.e., the vehicle’s lane selection/change. A vehicle’s lane selection/change is determined by the
driver’s preference or as a reaction to real-time traffic events. Vehicle lane detection relies
on additional devices and precise localization techniques [95], which are not available for common
vehicles. To deal with the unobservable states, an HMM is selected as a suitable model to establish the
relationship between vehicle speeds and different states and to predict the vehicle speed. An HMM is a
stochastic Markov model where the observation is a probabilistic function of the hidden states [96].
Although hidden states are not directly observable, they have physical meanings in the vehicle speed
prediction application and can be deduced from the observation sequence. A hidden state represents
the joint distribution between the traffic speed and vehicle speed on a road segment through an emission
function. Because different types of vehicles, e.g., sedan and SUV, have different mobility characteristics,
an HMM is built separately for each type of vehicle for accurate modeling and prediction. When
a vehicle enters road segment $k$, the vehicle speed data on all previous $k-1$ road segments as
well as the traffic speed prediction for all remaining road segments are used by the HMM to predict the
vehicle speed on the remaining road segments.
Figure 5.1: Scheme of the 2-level vehicle speed prediction system. Block labels: Cloud, Data Upload, Historical Data, Real-time Data, Training, Prediction, NN, Traffic Prediction, Result, Historical Cycle, Historical Traffic, Driving Route, HMM, Vehicle Speed Prediction.
5.3 Vehicle Speed Prediction System Design
In this section, we will elaborate on the two-level prediction system design based on NN
and HMM.
5.3.1 Traffic Speed Prediction with NN
Traffic speeds of a target road segment in future time horizons are predicted based on the
current and historical data of itself and its neighbor road segments. In our work, the traffic speed
prediction starts from 7AM for the morning rush hours with the time resolution (prediction period)
∆t. At time t, traffic speeds of the road segment i in future n periods (n-period ahead) are to be
predicted by the nonlinear function gi(·):
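$$v_i(t + n\Delta t) = g_i\Big(\{v_i(t), \ldots, v_i(t - m_i\Delta t)\},\; \{f_i(t), \ldots, f_i(t - m'_i\Delta t)\},\; \{\mathbf{v}_{nb,i}(t), \ldots, \mathbf{v}_{nb,i}(t - m_{nb,i}\Delta t)\},\; \{\mathbf{f}_{nb,i}(t), \ldots, \mathbf{f}_{nb,i}(t - m'_{nb,i}\Delta t)\}\Big) \tag{5.1}$$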
Prediction input data can be categorized into four groups. The first group $\{v_i(t), \ldots, v_i(t - m_i\Delta t)\}$
is the target road’s historical traffic speed, where $m_i$ is the number of past periods used. Similarly,
the second group $\{f_i(t), \ldots, f_i(t - m'_i\Delta t)\}$ is the historical flow of the target road. Here the flow of
road segment $i$ in time period $t$ is defined as the total number of vehicles entering the road
segment within that period of time (vehicle count/$\Delta t$). The third and fourth groups are historical
data of the target road’s neighbor roads, where $\mathbf{v}_{nb}$ and $\mathbf{f}_{nb}$ indicate the vectors of traffic speed and flow,
respectively. The size of each data group is constrained to reduce the training complexity. With the
size constraint, the neighbor roads and the length of historical data are selected according to the traffic
data correlations with the target road. The interdependence relation $g_i(\cdot)$ is to be learned by an NN. An NN
is a statistical model with neurons and sets of adaptive weights to approximate non-linear functions
of their input. An NN can be described as a weighted directed graph where artificial neurons are
nodes and weighted directed edges are connections between neuron outputs and neuron inputs [97].
A neuron forms the weighted sum of its inputs $x_i$ with individual weights $w_i$, adds a bias $b$, and passes
the result through an activation function $f$ to generate the output $y$, as shown in Fig. 5.2. An activation function can be a step,
piecewise linear, sigmoid, or Gaussian function. NNs can be categorized into feed-forward networks
and feedback networks. In feed-forward networks, there are no loops in the NN graph. In contrast, loops
and feedback connections occur in feedback networks. The NN model selected is a nonlinear
autoregressive network with exogenous inputs (NARX) and feedback connections, as shown in
Fig. 5.3, because the inputs include dependent signals relating to the target road $i$ and exogenous
(independent) signals of road $i$’s neighbors. The NN model is designed with one hidden layer of
30 log-sigmoid neurons. After the NN model is trained with historical traffic data, it takes $\mathbf{v}_{nb}$, $\mathbf{f}_{nb}$,
and $v_i$ as input to predict road $i$’s traffic speed at future times. When a vehicle is driving along
75
CHAPTER 5. TRAFFIC AND VEHICLE SPEED PREDICTION IN VEHICULAR NETWORKS
the route, its travel times to the following road segments are first estimated according to the vehicle’s current speed.
Then the traffic speed prediction for each road segment at the vehicle’s estimated arrival time
is fetched accordingly.
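A single forward pass through a network of the size described above can be sketched as follows. This is a simplified illustration: it shows how the four lagged data groups of (5.1) might be flattened into one input vector and passed through one hidden layer of log-sigmoid neurons, but it omits the NARX feedback connection and the training step. The linear output neuron, the function names, and the weight shapes are assumptions made for this example.

```python
import numpy as np

def logsig(z):
    """Log-sigmoid activation, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def build_input(v_i, f_i, v_nb, f_nb):
    """Concatenate the four lagged data groups of (5.1) into one vector.

    v_i, f_i:   lagged speed and flow of the target road segment
    v_nb, f_nb: lagged speed and flow of neighbor roads (neighbors x lags)
    """
    groups = (v_i, f_i, v_nb, f_nb)
    return np.concatenate([np.ravel(np.asarray(g, dtype=float)) for g in groups])

def predict_speed(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer of log-sigmoid neurons and an assumed linear output.

    w_hidden has shape (30, len(x)) for the 30-neuron layer described above.
    """
    hidden = logsig(w_hidden @ x + b_hidden)
    return float(w_out @ hidden + b_out)
```

In practice the weights would come from training on historical traffic data, and multi-period predictions would feed earlier outputs back as inputs, which this sketch leaves out.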
Figure 5.2: Diagram of a NN neuron. Labels: inputs x1 to x5, weights w1 to w5, summation Σ, bias b, activation function f, output y.
Figure 5.3: NARX NN Model for Traffic Speed Prediction. Labels: Input Layer (Current and Historical Traffic Flow and Speed), Hidden Layer (30 Neurons), Output Layer (Future Traffic Speed Prediction).
5.3.2 Vehicle Speed Prediction with HMM
5.3.2.1 HMM Design and Training
The HMM for vehicle speed prediction is designed with a left-to-right structure, as shown
in Fig. 5.4, to represent the driving from the start point to the destination. An HMM is called
left-to-right if and only if the hidden state transitions do not include loops. In the HMM, a stage
represents a road segment. Each circle in Fig. 5.4 represents a hidden state. We denote $q_k$ as the
hidden state in stage $k$, which is in the state set $\{S_i : N_{k-1}+1 \le i \le N_k\}$. Hidden states are traversed
over stages when the vehicle is driving. The observation of the HMM in stage $k$ is denoted as $o_k$. An
HMM can be fully described by three parameters as $\lambda = (A, B, \pi)$, where $A$ is the state transition
distribution, $B$ is the emission probability distribution, and $\pi$ denotes the initial state distribution.
Each stage observation is the realization of the bivariate random variable $O_k = (V_{tf,k}, V_{vh,k})$, where
$V_{tf,k}$ and $V_{vh,k}$ represent the traffic and vehicle speed in stage $k$, respectively. After a vehicle passes a
stage $k$, $o_k$ is recorded. The state transition distribution matrix is defined as $A = \{a_{i,j}\}$, where
$$a_{i,j} = P[q_{k+1} = S_j \mid q_k = S_i], \qquad N_{k-1}+1 \le i \le N_k,\quad N_k+1 \le j \le N_{k+1} \tag{5.2}$$
$a_{i,j}$ is the probability of transferring from $S_i$ in stage $k$ to $S_j$ in stage $k+1$. Since the speeds in each
observation are continuous values, $B = \{b_i(o_k)\}$ is defined as the conditional joint probability density
of observation random variables given hidden state $S_i$. Gaussian mixture models [96, 98] are used to
construct $b_i(o_k)$ as:
$$b_i(o_k) = \sum_{m=1}^{M} c_{i,m}\, G(o_k, \boldsymbol{\mu}_{i,m}, \boldsymbol{\Sigma}_{i,m}) \tag{5.3}$$
where $M$ is the number of Gaussian mixture components. $c_{i,m}$ is the mixture coefficient for the
$m$th mixture in state $i$. $G(o_k, \boldsymbol{\mu}_{i,m}, \boldsymbol{\Sigma}_{i,m})$ is a bivariate Gaussian density with mean vector $\boldsymbol{\mu}_{i,m}$ and
covariance matrix $\boldsymbol{\Sigma}_{i,m}$. $\pi = \{\pi_i, 1 \le i \le N_1\}$ defines the probability of the initial hidden state in
the first stage.
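The emission density in (5.3) can be evaluated directly from its definition. The sketch below is a minimal illustration with hypothetical function names; it assumes an observation is a (traffic speed, vehicle speed) pair and that each covariance matrix is a valid 2 × 2 positive definite matrix.

```python
import numpy as np

def bivariate_gaussian(o, mean, cov):
    """Bivariate Gaussian density G(o, mu, Sigma)."""
    diff = np.asarray(o, dtype=float) - np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    norm = 2.0 * np.pi * np.sqrt(np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff) / norm)

def emission_density(o, coeffs, means, covs):
    """Gaussian mixture emission density b_i(o) for one hidden state.

    coeffs: mixture coefficients c_{i,m}, assumed non-negative and summing to one
    means:  mean vectors mu_{i,m}; covs: covariance matrices Sigma_{i,m}
    """
    return sum(
        c * bivariate_gaussian(o, mu, cov)
        for c, mu, cov in zip(coeffs, means, covs)
    )
```

For long observation sequences, an implementation would typically work with log-densities to avoid numerical underflow; the direct form is kept here to mirror (5.3).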
The HMM should be well trained before prediction, i.e., determining the number of hidden
states in each stage, A, B, and π. The optimal number of hidden states can be selected according
to Akaike’s information criterion (AIC) or Bayesian information criterion (BIC) on a maximum
likelihood basis [99, 100], which are defined as:
$$\begin{aligned} \mathrm{AIC} &= -2\log L + 2p \\ \mathrm{BIC} &= -2\log L + p\log T \end{aligned} \tag{5.4}$$
Figure 5.4: Left-to-right HMM for vehicle speed prediction. Stages: Road 1, Road 2, Road 3, ..., Road K along the driving route; hidden states $S_1, \ldots, S_{N_K}$ with emission densities $b_i(o_k)$ and transition probabilities $a_{i,j}$ between consecutive stages.
where $\log L$ is the log-likelihood of the fitted model, $p$ is the number of model parameters, which accounts for the initial probabilities $\pi_i$, the transition probabilities $a_{i,j}$, and the mixture parameters $c_{j,m}$, $\mu_{j,m}$, and $\Sigma_{j,m}$ in (5.3), and $T$ is the number of samples used for training. The lower the AIC or BIC value, the better the model. AIC and BIC are both penalized likelihood criteria. Their main difference is that BIC has a larger penalty term than AIC whenever $\log T > 2$, which holds for all but very small sample sizes. As a result, AIC risks selecting a model that is too large, while BIC risks selecting one that is too small. AIC works better at avoiding underfitting with a small sample set, while BIC is preferred with a large sample set to prevent overfitting [101]. In this work, both AIC and BIC are used and their results are compared. The HMM configurations with the smallest AIC and BIC values are selected for vehicle speed prediction. An HMM is trained on historical traffic and vehicle data with the Baum-Welch algorithm [102]. Since the number of stages is large, every stage is assumed to have the same number of hidden states Q to reduce the complexity of model selection. The number of Gaussian mixture components M is also the same for every hidden state. HMM training aims to find the optimal (Q, M) configuration and the corresponding HMM parameters. The training procedure has three steps. First, candidate (Q, M) configurations are initialized with different values. Second, for each (Q, M) configuration, the corresponding HMM is trained with the Baum-Welch algorithm and its AIC and BIC values are calculated. Finally, the two configurations with the smallest AIC and the smallest BIC values, respectively, are selected, and their HMMs are used for vehicle speed prediction and evaluation.
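The selection step can be illustrated with a minimal Python sketch. It assumes that each candidate HMM has already been trained and that its log-likelihood and parameter count are available; the function names and the numbers in `example_fits` are hypothetical and are not results from this work.

```python
import math

def aic(log_l, p):
    """Akaike's information criterion as defined in Eq. (5.4)."""
    return -2.0 * log_l + 2.0 * p

def bic(log_l, p, t):
    """Bayesian information criterion as defined in Eq. (5.4); t is the sample count."""
    return -2.0 * log_l + p * math.log(t)

def select_configs(fits, t):
    """Return the (Q, M) keys with the smallest AIC and the smallest BIC.

    fits maps (Q, M) -> (log_likelihood, n_params) of an already trained HMM.
    """
    best_aic = min(fits, key=lambda k: aic(*fits[k]))
    best_bic = min(fits, key=lambda k: bic(*fits[k], t))
    return best_aic, best_bic

# Made-up log-likelihoods and parameter counts, for illustration only.
example_fits = {
    (1, 2): (-5200.0, 120),
    (2, 3): (-5150.0, 150),
    (3, 4): (-4900.0, 310),
}
best_by_aic, best_by_bic = select_configs(example_fits, t=3000)
```

With these made-up inputs the two criteria pick different configurations, because the BIC penalty per parameter (log T) is larger than the AIC penalty (2) at T = 3000.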
5.3.2.2 Prediction algorithm
When a vehicle is in stage $k^*$, the observations in all $k^* - 1$ previous stages are denoted by the sequence $\mathbf{o}_{k^*-1} = (o_1, o_2, \ldots, o_{k^*-1})$. The vehicle speed is predicted as the conditional expectation of the vehicle speed in stage $k$ ($k \ge k^*$), given $\mathbf{o}_{k^*-1}$ and the traffic prediction:
\[
v_{vh,k}^{k^*-1}(v_{tf,k}) = E\left[V_{vh,k} \mid V_{tf,k} = v_{tf,k}, \mathbf{o}_{k^*-1}\right], \quad k \ge k^* \tag{5.5}
\]
where $v_{tf,k}$ is the measured or predicted traffic speed for road segment $k$ at the time when the vehicle arrives at it. The vehicle’s travel time from stage $k^*$ to stage $k$ is estimated from the vehicle’s current speed and the traffic speed information. The gap $\Delta k$ between the prediction stage $k$ and the last observation stage $k^* - 1$ is the prediction ahead step. $v_{vh,k}^{k^*-1}(v_{tf,k})$ is calculated from the conditional joint pdf of the vehicle speed and traffic speed given the observation sequence, $f(v_{vh,k}, v_{tf,k} \mid \mathbf{o}_{k^*-1}, \lambda)$ for $k \ge k^*$. Obtaining this conditional joint pdf requires two internal statistics: the scaled forward probability $\alpha_{k^*-1}(i) = P(q_{k^*-1} = S_i \mid \mathbf{o}_{k^*-1}, \lambda)$ and the probabilities of the subsequent hidden states given the previous observation sequence, $P(q_k = S_j \mid \mathbf{o}_{k^*-1}, \lambda)$ for $k \ge k^*$.
The scaled forward probability $\alpha_{k^*-1}(i)$ is calculated by the forward-backward algorithm [103, 104]. For stage $k^*$, $P(q_{k^*} = S_j \mid \mathbf{o}_{k^*-1}, \lambda)$ is calculated from the state transition probabilities as:
\[
P(q_{k^*} = S_j \mid \mathbf{o}_{k^*-1}, \lambda) = \sum_{i \in I_{k^*-1}} a_{ij}\, P(q_{k^*-1} = S_i \mid \mathbf{o}_{k^*-1}, \lambda) \tag{5.6}
\]
where $I_{k^*-1}$ is the set of indices of the states belonging to stage $k^* - 1$.
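Similarly, the probabilities of the subsequent hidden states $P(q_k = S_j \mid \mathbf{o}_{k^*-1}, \lambda)$ can be calculated for $k > k^*$ by repeating the transition step in (5.6) stage by stage. The conditional joint pdf of the vehicle speed and traffic speed given the observation sequence can then be calculated as:
\[
f(v_{vh,k}, v_{tf,k} \mid \mathbf{o}_{k^*-1}, \lambda) = \sum_{i \in I_k} b_i(o_k)\, P(q_k = S_i \mid \mathbf{o}_{k^*-1}, \lambda) \tag{5.7}
\]
with $o_k = (v_{vh,k}, v_{tf,k})$.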
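The predicted vehicle speed is thus calculated as:
\[
\begin{aligned}
v_{vh,k}^{k^*-1}(v_{tf,k}) &= \int_0^{v_{vh}^{\max}} v_{vh,k}\, f(v_{vh,k} \mid v_{tf,k}, \mathbf{o}_{k^*-1}, \lambda)\, dv_{vh,k} \\
&= \int_0^{v_{vh}^{\max}} v_{vh,k}\, \frac{f(v_{vh,k}, v_{tf,k} \mid \mathbf{o}_{k^*-1}, \lambda)}{f_{V_{tf,k}}(v_{tf,k})}\, dv_{vh,k}
\end{aligned}
\tag{5.8}
\]
where $v_{vh}^{\max}$ is the maximum vehicle speed and $f_{V_{tf,k}}(v_{tf,k})$ is the marginal density of the traffic speed in stage $k$. $f_{V_{tf,k}}(v_{tf,k})$ is approximated by a Gaussian distribution with the traffic speed prediction result as its mean and the prediction root mean square error (RMSE) as its standard deviation.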
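The chain from (5.6) to (5.8) can be sketched numerically in Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used in this work: the function names are hypothetical, each Gaussian component is assumed to be given as a (weight, mean, covariance) tuple ordered as (vehicle speed, traffic speed), the integral is evaluated with the trapezoidal rule, and the marginal density value in the denominator is passed in by the caller.

```python
import math

def gauss2(v, t, mu, cov):
    """Bivariate Gaussian density at (v, t); mu = (mu_v, mu_t), cov = ((s_vv, s_vt), (s_vt, s_tt))."""
    (s_vv, s_vt), (_, s_tt) = cov
    det = s_vv * s_tt - s_vt * s_vt
    dv, dt = v - mu[0], t - mu[1]
    quad = (s_tt * dv * dv - 2.0 * s_vt * dv * dt + s_vv * dt * dt) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def propagate(prev_probs, trans):
    """Eq. (5.6): trans[i][j] is a_ij from state i of one stage to state j of the next."""
    return [sum(prev_probs[i] * trans[i][j] for i in range(len(prev_probs)))
            for j in range(len(trans[0]))]

def joint_pdf(v, t, state_probs, mixtures):
    """Eq. (5.7): sum over states of b_i(o_k) times the state probability."""
    total = 0.0
    for p_i, comps in zip(state_probs, mixtures):
        b_i = sum(c * gauss2(v, t, mu, cov) for c, mu, cov in comps)  # mixture as in Eq. (5.3)
        total += p_i * b_i
    return total

def predict_speed(t, state_probs, mixtures, marginal_t, v_max, steps=2000):
    """Eq. (5.8): trapezoidal integration of v * joint pdf over [0, v_max], divided by the marginal."""
    h = v_max / steps
    acc = 0.0
    for n in range(steps + 1):
        v = n * h
        weight = 0.5 if n in (0, steps) else 1.0
        acc += weight * v * joint_pdf(v, t, state_probs, mixtures)
    return acc * h / marginal_t
```

For prediction several stages ahead, `propagate` would be applied once per stage before `joint_pdf` is evaluated for the target stage.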
5.4 Road Network and Simulation Setup
The prediction system is evaluated on the Luxembourg motorway network during the morning rush hours between 7 AM and 10 AM. Luxembourg has the highest motorway density in Europe and complex traffic. Floating car data, including real-time speeds and driving routes, are a necessary input for the prediction system. Existing Luxembourg ITS systems, e.g., the Ponts et Chaussees traffic monitoring system [105], only provide volume counts or traffic speeds at specific locations. Thus, to simulate floating car data, microscopic traffic simulation is carried out in SUMO with the VehiLux vehicular mobility model [106, 107, 108]. VehiLux generates vehicle driving traces based on real traffic volume counts and the Luxembourg GIS map. The Luxembourg road network in SUMO is shown in Fig. 5.5. The procedure of the simulation data preparation is shown in Fig. 5.6. First, daily traffic count data are downloaded from the Ponts et Chaussees traffic monitoring system by a Python script. Second, based on the volume count data, the VehiLux model simulates the traffic demand and generates vehicle route data with Dijkstra’s algorithm, followed by augmentation with Gawron’s dynamic route assignment algorithm [109]. Finally, the vehicle route data are provided to SUMO for simulation, and traffic and vehicle driving data are parsed from the simulation result.
Figure 5.5: Luxembourg road network in SUMO
A 6.5 km part of the A3 motorway (south-to-north direction) in the Luxembourg network is selected as the driving route for the prediction evaluation, as shown in Fig. 5.7. The route is composed of 12 road segments in the VehiLux model. To detect vehicles’ instantaneous driving speeds, sensors are placed every 50 meters from the beginning of each road segment.
[Flowchart: Ponts et Chaussees traffic count, VehiLux model, vehicle traces, SUMO simulator, traffic and vehicle data]
Figure 5.6: Procedure of data preparation for traffic prediction based on simulation
The vehicle speed on a road segment is calculated as the average of the speeds captured by all sensors on that segment. The traffic speed on a road segment within a time interval is calculated by averaging the speeds of all passing vehicles. Five types of vehicles are considered for the vehicle trace generation. They are configured with different lengths and maximum speeds and with the Krauss car-following model. We simulate the morning rush hour traffic on every weekday between March and December 2010 and store the results in XML files. Java programs are developed to parse the vehicle speed, traffic speed, and traffic flow from the XML files. The MATLAB NN and HMM toolboxes [110, 111] are used for model training and prediction. Eq. (5.1) is learned by an NN for each road segment with the traffic prediction period ∆t = 3 min. Future traffic speeds of each road segment are predicted up to five periods ahead. NN models are trained with traffic data from March to November. Data from December are used for traffic prediction verification. For HMM training, 3000 vehicle traces are selected randomly from the March to November data set. Another 2000 traces from December are used for vehicle speed prediction verification.
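The two averaging rules for vehicle speed and traffic speed can be illustrated with a short Python sketch. It is a sketch under assumptions, not the Java parsing code used in this work: the function names are hypothetical, and each vehicle pass is assumed to carry a single timestamp (for example the segment entry time) that decides which 3-minute interval it falls into.

```python
from collections import defaultdict
from statistics import mean

PERIOD_S = 180  # traffic prediction period of 3 min, in seconds

def vehicle_segment_speed(sensor_speeds):
    """Vehicle speed on a segment: mean of the speeds captured by the segment's sensors."""
    return mean(sensor_speeds)

def traffic_speeds(passes, period_s=PERIOD_S):
    """Traffic speed per time interval for one road segment.

    passes is an iterable of (timestamp_s, sensor_speeds) tuples, one per passing vehicle.
    Returns {interval_index: mean segment speed of the vehicles in that interval}.
    """
    buckets = defaultdict(list)
    for timestamp_s, sensor_speeds in passes:
        buckets[int(timestamp_s // period_s)].append(vehicle_segment_speed(sensor_speeds))
    return {idx: mean(speeds) for idx, speeds in sorted(buckets.items())}
```

Intervals without any passing vehicle are simply absent from the returned dictionary; how such gaps are handled in this work is not specified here.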
5.5 Result and Analysis
We first evaluate the traffic speed prediction performance on all road segments in the route. The one-period-ahead prediction result for a randomly selected road segment is shown in Fig. 5.8, with a prediction root mean square error (RMSE) of 2.434 m/s. The RMSEs of all road segments with different prediction ahead periods are shown in Fig. 5.9. Although the RMSEs become larger as the number of ahead periods increases, they remain below 3.5 m/s, which indicates satisfactory prediction accuracy.
To evaluate vehicle speed prediction performance, we randomly pick one type of vehicle, i.e., “twingo”. AIC and BIC values are computed for HMMs with different (Q, M) configurations and shown in Fig. 5.10a and Fig. 5.10b, respectively. (Q = 3, M = 7) and (Q = 2, M = 5) are selected as the optimal configurations because they give the smallest AIC and BIC values, respectively.
Figure 5.7: Road set for prediction in Luxembourg motorway network
[Plot: x-axis time slot, y-axis traffic speed; series: observation, prediction]
Figure 5.8: Traffic prediction result for road segment #7 with one prediction period ahead
HMMs with these two configurations are trained by the Baum-Welch algorithm and used for vehicle speed prediction. A trained HMM with (Q = 3, M = 7) for a randomly selected road segment is checked. Samples generated by the HMM are compared with simulation observations (not used for the HMM training), as shown in Fig. 5.11a and Fig. 5.11b. The results show that the HMM can effectively reproduce the relationship between traffic speed and vehicle speed, which is important for accurate vehicle speed prediction.
[Plot: x-axis road index, y-axis RMSE (m/s); series: 1-period ahead, 3-period ahead, 5-period ahead]
Figure 5.9: Traffic speed prediction RMSE of all road segments
We denote our proposed method as NN/HMM. For performance evaluation, NN/HMM is compared with two other methods: the traffic speed approximation (TSAP) and KDE combined with NN (NN/KDE), which is similar to the method in [94]. TSAP simply approximates the individual vehicle speed by the traffic speed predicted by the NN. In NN/KDE, the conditional pdf of the vehicle speed given the traffic speed on each road segment is learned using KDE, and the vehicle speed is predicted as the conditional expectation according to this pdf and the traffic speed prediction. Fig. 5.12 and Fig. 5.13 show the RMSEs and mean absolute percentage errors (MAPEs), respectively, of the vehicle speed prediction with these three methods, where NN/HMM is evaluated with one prediction step ahead (∆k = 1). The results show that NN/HMM (Q = 3, M = 7) outperforms the others, while TSAP has the worst accuracy. Compared with TSAP and NN/KDE, NN/HMM (Q = 3, M = 7) reduces the RMSE by 45.1% and 18.2% on average, respectively. We also evaluate the influence of the prediction ahead step ∆k on the prediction accuracy. Fig. 5.14 shows that the prediction RMSE increases as ∆k becomes larger. When ∆k < 7, NN/HMM (Q = 3, M = 7) outperforms NN/KDE on most road segments. Finally, we check the absolute prediction errors of 2000 individual vehicles on a selected road segment, with the histogram and pdf plotted in Fig. 5.15. The 98.7th percentile of the absolute error is 1 m/s, which indicates satisfactory prediction accuracy.
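For reference, the two error metrics and the relative reduction quoted above can be computed as in the following Python sketch. It uses the standard definitions of RMSE and MAPE, which are assumed to match those used in this evaluation; the function names are illustrative.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` lowers an error metric relative to `baseline`."""
    return 100.0 * (baseline - proposed) / baseline
```

An average reduction over road segments, as reported in the text, would apply `relative_reduction` per segment or to the averaged RMSEs; which of the two was used is not stated here.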
[Plots: x-axis number of Gaussian components (M), y-axis AIC or BIC (×10^5); series: Q=1 to Q=6]
(a) AIC values with different configurations
(b) BIC values with different configurations
Figure 5.10: AIC and BIC values for HMMs with different (Q, M) configurations
[Scatter plots: x-axis traffic speed (m/s), y-axis vehicle speed (m/s)]
(a) Samples generated by HMM
(b) Simulation observations
Figure 5.11: Comparison between HMM sampling and simulation observation for one road segment
[Plot: x-axis road index, y-axis RMSE (m/s); series: TSAP, NN/KDE, NN/HMM (Q=3, M=7), NN/HMM (Q=2, M=4)]
Figure 5.12: Vehicle speed prediction RMSE of TSAP, NN/KDE and NN/HMM (∆k = 1)
[Plot: x-axis road index, y-axis MAPE (%); series: TSAP, NN/KDE, NN/HMM (Q=3, M=7), NN/HMM (Q=2, M=4)]
Figure 5.13: Vehicle speed prediction MAPE of TSAP, NN/KDE and NN/HMM (∆k = 1)
[Plot: x-axis road index, y-axis RMSE (m/s); series: NN/HMM (k=1), NN/HMM (k=3), NN/HMM (k=5), NN/HMM (k=7), NN/KDE]
Figure 5.14: Vehicle speed prediction RMSE of NN/KDE and NN/HMM with different ∆k
[Plot: x-axis prediction absolute difference]
Figure 5.15: Histogram and pdf of vehicle speed prediction absolute error for road segment #5
Chapter 6
Conclusion and Future Research
This dissertation focuses on the design of optimization and management systems for cost-effective and energy-efficient CPS, including an energy ecosystem with hierarchical optimization, a V2G system for reactive power compensation, an on-road PHEV power management system, and vehicle speed prediction in vehicular networks. We first propose an energy ecosystem with DR and DER management. DR considers updated external information and the user’s task preferences in decision making. Preference functions, modeled with KDE, describe the user’s task preferences. With two-level shared cost-led management, µCHPs are fully utilized to reduce the energy consumption cost of the whole community. Finally, VRB management with Q-learning obtains the optimal discharging policy considering the utility price and the stochastic elements of wind power and load demand. Simulation results show that this management system is highly effective in reducing the energy consumption cost.
Because PEVs and PHEVs play significant roles in smart grid CPS, we then study their V2G applications. In a smart city with a smart grid, the travel information of each PEV agent can be collected from cellular devices and used in optimization to resolve service conflicts, increase PEV agent benefits, and enhance the performance of the power distribution network. With on-board bidirectional AC chargers, PEVs are utilized as mobile and distributed VAr resources for reactive power compensation to the grid. We propose an optimal scheduling scheme for PEV parking and charging, which is the responsibility of a PEV aggregator. PEVs are scheduled after their reservations, i.e., charging requirements and parking preferences, are received by the aggregator. The scheduling problem is formulated as a multi-objective optimization problem considering both PEV agent benefits and the grid compensation performance. The original non-linear optimization problems are reformulated as MILP problems for efficient solving. With Lagrangian decomposition and the NNC method, Pareto points are solved in a decentralized way, and the approach is scalable in terms of the number of PEVs and charging stations. Simulation results from seven test cases show satisfactory solution quality and the effects of different aspects on PEV benefits and reactive power compensation performance. The trade-off between the two objectives is analyzed in detail. The load flow analysis demonstrates the effectiveness of the proposed V2G system in reducing power loss.
Besides smart grid power management and support, we also study the economical operation of PHEVs, i.e., optimal power management. We propose an on-road hierarchical power management CPS for PHEVs in vehicular networks. Driving cycles are decomposed and modeled with unit cycles in the spatial domain. The high-level online battery budget generation is formulated as an MSQP problem and solved with the nested decomposition algorithm. M/G2 torque and ICE speed policies are generated offline at the low level. The low-level management is formulated as a finite-horizon MDP and solved with the backward induction method. During driving, the high-level battery energy budgets are generated in real time. According to the battery energy budgets, power management decisions are made by looking up the policy tables. Simulation results for five different cases show that the proposed two-level MSQP/MDP method can utilize the vehicle speed prediction information and adapt to the stochastic real-time driving states so as to minimize the fuel consumption for the entire trip.
Vehicle speed prediction serves as an important input for our on-road PHEV power management system. Finally, a novel two-level vehicle speed prediction system for highway networks is proposed based on NN and HMM. According to the vehicle driving routes, the traffic speed of the target road segments is first predicted at the first level by an NN with historical traffic data. At the second level, the statistical relationship between traffic speed and vehicle speed is modeled by an HMM to account for the existence of unobservable states. The proposed method is compared with two other methods, namely traffic speed approximation and a KDE-based method. The results show that our proposed method outperforms the others in terms of prediction accuracy.
There are several directions for our future research. First, our existing system models can be further enhanced according to real system characteristics. For instance, in our V2G reactive power compensation system, the effect of random PEV parking and charging (without reservation) on the existing schedule should be considered. To handle this stochastic element, the scheduling system should be augmented with algorithms that deal with possible conflicts and adjust the schedule online. Second, our PHEV power management work and the vehicle speed prediction work should be integrated and evaluated as a whole, since the latter serves as the input for the former. Constrained by limited on-board computation resources, the integration requires faster algorithms so that the on-road computation requirement can be met. Finally, our work is currently evaluated and verified based on simulations. System evaluation based on real data from the power and vehicle industries is another important direction for future work.
Bibliography
[1] M. Albadi and E. El-Saadany, “Demand response in electricity markets: An overview,” in
Proc. IEEE Power Engineering Society General Meeting, June 2007.
[2] California Center for Sustainable Energy, Plug-in Electric Vehicles (PEVs), Available:
http://energycenter.org/index.php/technical-assistance/transportation/electric-vehicles.
[3] B. Kramer, S. Chakraborty, and B. Kroposki, “A review of plug-in vehicles and vehicle-to-grid
capability,” in Proc. IEEE Annual Conf. on Industrial Electronics, Nov. 2008, pp. 2278–2283.
[4] T. Markel, M. Kuss, and M. Simpson, “Value of plug-in vehicle grid support operation,” in
Proc. IEEE Innovative Technologies for an Efficient & Reliable Electricity Supply, Sept. 2010,
pp. 325–332.
[5] P. Vovos, A. Kiprakis, A. Wallace, and G. Harrison, “Centralized and distributed voltage
control: Impact on distributed generation penetration,” IEEE Trans. Power Systems, vol. 22,
no. 1, pp. 476–483, Feb. 2007.
[6] S. Bolognani and S. Zampieri, “A distributed control strategy for reactive power compensation
in smart microgrids,” ArXiv e-prints, 2011.
[7] S.-Y. Lee, C.-J. Wu, and W.-N. Chang, “A compact control algorithm for reactive
power compensation and load balancing with static Var compensator,” Electric
Power Systems Research, vol. 58, no. 2, pp. 63–70, 2001. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/S0378779601001274
[8] M. Kisacikoglu, B. Ozpineci, and L. Tolbert, “Examination of a PHEV bidirectional charger
system for V2G reactive power compensation,” in Proc. IEEE Applied Power Electronics
Conference and Exposition, Feb. 2010, pp. 458–465.
[9] A. Messac, A. Ismail-Yahaya, and C. Mattson, “The normalized normal constraint method
for generating the pareto frontier,” Structural and Multidisciplinary Optimization, vol. 25, pp.
86–98, 2003.
[10] P. Luh, L. Michel, P. Friedland, C. Guan, and Y. Wang, “Load forecasting and demand
response,” in Proc. IEEE Power and Energy Society General Meeting, July 2010.
[11] M. Miller, K. Griendling, and D. Mavris, “Exploring human factors effects in the smart grid
system of systems demand response,” in Proc. Int. Conf. System of Systems Engineering, 2012,
pp. 1–6.
[12] D. Livengood and R. Larsen, “The energy box: Locally automated optimal control of
residential electricity usage,” Service Science, vol. 1, no. 1, pp. 1–16, 2009.
[13] D. O’Neill, M. Levorato, A. Goldsmith, and U. Mitra, “Residential demand response using
reinforcement learning,” in Proc. IEEE Int. Conf. Smart Grid Communications, Oct. 2010, pp.
409–414.
[14] S. Ramchurn, P. Vytelingum, A. Rogers, and N. Jennings, “Agent-based control for
decentralised demand side management in the smart grid,” in Proc. Int. Conf. Autonomous Agents
and Multiagent Systems, Feb. 2011, pp. 5–12.
[15] W. Gu, Z. Wu, and X. Yuan, “Microgrid economic optimal operation of the combined heat
and power system with renewable energy,” in Proc. IEEE Power and Energy Society General
Meeting, July 2010, pp. 1–6.
[16] A. Rogers, S. Maleki, S. Ghosh, and N. R. Jennings, “Adaptive home heating control through
gaussian process prediction and mathematical programming,” in Int. Workshop Agent
Technology for Energy Systems, May 2011, pp. 71–78.
[17] X. Guan, Z. Xu, and Q.-S. Jia, “Energy-efficient buildings facilitated by microgrid,” IEEE
Trans. Smart Grid, vol. 1, no. 3, pp. 243–252, Dec. 2010.
[18] Y. Zhang, N. Gatsis, and G. Giannakis, “Robust energy management for microgrids with
high-penetration renewables,” IEEE Trans. Sustainable Energy, vol. PP, no. 99, pp. 1–10,
2013.
[19] A. Peacock and M. Newborough, “Impact of micro-CHP systems on domestic sector CO2
emissions,” Applied Thermal Engineering, vol. 25, no. 17-18, pp. 2653–2676, 2005.
[20] A. Hawkes and M. Leach, “Cost-effective operating strategy for residential micro-combined
heat and power,” Energy, vol. 32, no. 5, pp. 71–723, 2007.
[21] M. Houwing, R. Negenborn, and B. De Schutter, “Demand response with micro-chp systems,”
Proc. IEEE, vol. 99, no. 1, pp. 200–213, Jan. 2011.
[22] U.S. Environmental Protection Agency, New York Net-Metering Rules , Available:
http://www.epa.gov/chp/policies/policies/ nenewyorknetmeteringrules.html.
[23] L. Barote, R. Weissbach, R. Teodorescu, C. Marinescu, and M. Cirstea, “Stand-alone wind
system with vanadium redox battery energy storage,” in Proc. Int. Conf. Optimization of
Electrical and Electronic Equipment, May 2008, pp. 407–412.
[24] W. Wang, B. Ge, D. Bi, and D. Sun, “Grid-connected wind farm power control using VRB-
based energy storage system,” in Proc. IEEE Energy Conversion Congress and Exposition,
Sept. 2010, pp. 3772–3777.
[25] J. Chahwan, C. Abbey, and G. Joos, “VRB modelling for the study of output terminal voltages,
internal losses and performance,” in Proc. IEEE Electrical Power Conference, Oct. 2007, pp.
387–392.
[26] B. Silverman, Density Estimation for Statistics and Data Analysis. New York: Chapman and
Hall, 1986.
[27] F. Zenith and S. Skogestad, “Control of fuel cell power output,” Journal of Process Control,
vol. 17, no. 4, pp. 333–347, 2007.
[28] P. Zhao, H. Zhang, H. Zhou, J. Chen, S. Gao, and B. Yi, “Characteristics and performance of
10kw class all-vanadium redox-flow battery stack,” Journal of Power Sources, vol. 162, no. 2,
pp. 1416 – 1420, 2006.
[29] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proc. IEEE Int. Conf. Neural
Networks, vol. 4, Nov 1995, pp. 1942–1948.
[30] R. Eberhart and J. Kennedy, “A new optimizer using particle swarm theory,” in Proc. Int.
Symposium Micro Machine and Human Science, Oct 1995, pp. 39–43.
[31] K. E. Parsopoulos and M. N. Vrahatis, Particle Swarm Optimization and Intelligence:
Advances and Applications, 2010.
[32] X. Hu and R. Eberhart, “Solving constrained nonlinear optimization problems with
particle swarm optimization,” in Proc. World Multiconference on Systemics, Cybernetics and
Informatics, 2002, pp. 203–206.
[33] E. Laskari, K. Parsopoulos, and M. Vrahatis, “Particle swarm optimization for integer
programming,” in Proc. Evolutionary Computation, vol. 2, 2002, pp. 1582–1587.
[34] Y. del Valle, G. Venayagamoorthy, S. Mohagheghi, J.-C. Hernandez, and R. Harley, “Particle
swarm optimization: Basic concepts, variants and applications in power systems,” IEEE Trans.
Evolutionary Computation, vol. 12, no. 2, pp. 171–195, April 2008.
[35] Y. Yare, G. Venayagamoorthy, and U. Aliyu, “Optimal generator maintenance scheduling
using a modified discrete PSO,” IET Journal Generation, Transmission and Distribution,
vol. 2, no. 6, pp. 834–846, Nov 2008.
[36] J.-M. Yang, Y.-P. Chen, J.-T. Horng, and C.-Y. Kao, “Applying family competition to evolution
strategies for constrained optimization,” in Evolutionary Programming VI, ser. Lecture Notes
in Computer Science, P. Angeline, R. Reynolds, J. McDonnell, and R. Eberhart, Eds. Springer
Berlin / Heidelberg, 1997, vol. 1213, pp. 201–211.
[37] J. Kennedy and R. Eberhart, “A discrete binary version of the particle swarm algorithm,” in
Proc. IEEE Int. Conf. Systems, Man, and Cybernetics, vol. 5, Oct 1997, pp. 4104–4108.
[38] C. Watkins and P. Dayan, “Q-learning,” Machine Learning, vol. 8, no. 3-4, pp. 279–292, 1992.
[39] A. Faruqui and S. Sergici, “Household response to dynamic pricing of electricity: a survey of
15 experiments,” J. of Regulatory Economics, vol. 38, no. 2, pp. 193–225, 2010.
[40] Jacobs 31-20 (20kw) Complete System Pricing, Available: http://www.windturbine.net/
documents/Pricing/wtic retail pricing 2012.01.pdf.
[41] Operational and Maintenance Costs for Wind Turbines, Available: http://www.
windmeasurementinternational.com/wind-turbines/om-turbines.php.
[42] VRB Flow Battery Demonstration, Available: http://apps1.eere.energy.gov/tribalenergy/pdfs/
wind akres04.pdf.
[43] M. Yilmaz and P. Krein, “Review of benefits and challenges of vehicle-to-grid technology,” in
Proc. IEEE Energy Conversion Congress and Exposition (ECCE), 2012, pp. 3082–3089.
[44] E. Sortomme and M. El-Sharkawi, “Optimal scheduling of vehicle-to-grid energy and ancillary
services,” IEEE Trans. Smart Grid, vol. 3, no. 1, pp. 351–359, 2012.
[45] A. Lam, K.-C. Leung, and V. Li, “Capacity management of vehicle-to-grid system for power
regulation services,” in Proc. IEEE Int. Conf. on Smart Grid Communications, 2012, pp.
442–447.
[46] K. Shimizu, T. Masuta, Y. Ota, and A. Yokoyama, “Load frequency control in power system
using vehicle-to-grid system considering the customer convenience of electric vehicles,” in
Proc. Int. Conf. on Power System Technology, 2010, pp. 1–8.
[47] M. Kisacikoglu, B. Ozpineci, and L. Tolbert, “Effects of V2G reactive power compensation
on the component selection in an ev or phev bidirectional charger,” in Proc. IEEE Energy
Conversion Congress and Exposition, 2010, pp. 870–876.
[48] C. Wu, H. Mohsenian-Rad, and J. Huang, “PEV-based reactive power compensation for wind
DG units: A stackelberg game approach,” in Proc. IEEE Int. Conf. Smart Grid
Communications, 2012, pp. 504–509.
[49] C. Wu, H. Mohsenian-Rad, J. Huang, and J. Jatskevich, “PEV-based combined frequency and
voltage regulation for smart grid,” in Proc. IEEE PES Innovative Smart Grid Technologies,
2012, pp. 1–6.
[50] F. Glover, “Improved linear integer programming formulations of nonlinear integer problems,”
Management Science, vol. 22, no. 4, pp. 455–460, 1975.
[51] A. Balakrishnan and S. C. Graves, “A composite algorithm for a concave-cost network
flow problem,” Networks, vol. 19, no. 2, pp. 175–202, 1989. [Online]. Available:
http://dx.doi.org/10.1002/net.3230190202
[52] D. Yue, G. Guillén-Gosálbez, and F. You, “Global optimization of large-scale mixed-integer
linear fractional programming problems: A reformulation-linearization method and process
scheduling applications,” AIChE Journal, vol. 59, no. 11, pp. 4255–4272, 2013. [Online].
Available: http://dx.doi.org/10.1002/aic.14185
[53] H. D. Sherali and O. Ulular, “A primal-dual conjugate subgradient algorithm for specially
structured linear and convex programming problems,” Applied Mathematics and Optimization,
vol. 20, pp. 193–221, 1989. [Online]. Available: http://dx.doi.org/10.1007/BF01447654
[54] E. Kutanoglu and S. Wu, “On combinatorial auction and lagrangean relaxation for distributed
resource scheduling,” IIE Trans., vol. 31, no. 9, pp. 813–826, 1999.
[55] TOMLAB Optimization in MATLAB, Available: http://tomopt.com/tomlab/.
[56] U.S. Environmental Protection Agency, Sector Collaborative on
Energy Efficiency Accomplishments and Next Steps, Available:
http://www.epa.gov/cleanenergy/documents/suca/sector collaborative.pdf.
[57] Norman Disney & Young, Power Factor Correction Evaluation, Available:
http://www.abcb.gov.au.
[58] S. Lin, D. Salles, W. Freitas, and W. Xu, “An intelligent control strategy for power factor
compensation on distorted low voltage power systems,” Smart Grid, IEEE Transactions on,
vol. 3, no. 3, pp. 1562–1570, Sept 2012.
[59] Power factor correction for buildings: Power quality improved, Avail-
able: http://www.epcos.com/epcos-en/373562/tech-library/articles/applications—
cases/applications—cases/power-quality-improved/171824.
[60] W. H. Kersting, Distribution System Modeling and Analysis. CRC Press, Jan. 2012.
[61] B. Mashadi and S. Emadi, “Dual-mode power-split transmission for hybrid electric vehicles,”
IEEE Trans. Vehicular Technology, vol. 59, no. 7, pp. 3223–3232, 2010.
[62] Y. Li and N. Kar, “Advanced design approach of power split device of plug-in hybrid electric
vehicles using dynamic programming,” in Proc. of IEEE Vehicle Power and Propulsion
Conference, 2011, pp. 1–6.
[63] Y. Li and N. C. Kar, “Investigating the effects of power split PHEV transmission gear ratio to
operation cost,” in Proc. of IEEE Transportation Electrification Conference and Expo, 2012,
pp. 1–6.
[64] S. Moura, H. Fathy, D. Callaway, and J. Stein, “A stochastic optimal control approach for power
management in plug-in hybrid electric vehicles,” IEEE Trans. Control Systems Technology,
vol. 19, no. 3, pp. 545–555, 2011.
[65] Q. Gong, Y. Li, and Z.-R. Peng, “Optimal power management of plug-in HEV with
intelligent transportation system,” in Proc. of IEEE/ASME Int. Conf. on Advanced Intelligent
Mechatronics, Sept. 2007.
[66] Y. Bin, Y. Li, Q. Gong, and Z.-R. Peng, “Multi-information integrated trip specific
optimal power management for plug-in hybrid electric vehicles,” in Proc. of American Control
Conference, June 2009.
[67] M. Zhang, Y. Yang, and C. Mi, “Analytical approach for the power management of
blended-mode plug-in hybrid electric vehicles,” IEEE Trans. Vehicular Technology, vol. 61, no. 4, pp.
1554–1566, May 2012.
[68] N. Schouten, M. Salman, and N. Kheir, “Fuzzy logic control for parallel hybrid vehicles,”
IEEE Trans. Control Systems Technology, vol. 10, no. 3, pp. 460–468, 2002.
[69] J. Gao, F. Sun, H. He, G. Zhu, and E. Strangas, “A comparative study of supervisory control
strategies for a series hybrid electric vehicle,” in Proc. of Asia-Pacific Power and Energy
Engineering Conference, 2009, pp. 1–7.
[70] M. Koot, J. T. B. A. Kessels, B. de Jager, W. P. M. H. Heemels, P. P. J. Van den Bosch, and
M. Steinbuch, “Energy management strategies for vehicular electric power systems,” IEEE
Trans. Vehicular Technology, vol. 54, no. 3, pp. 771–782, 2005.
[71] S. Adhikari, S. Halgamuge, and H. Watson, “An online power-balancing strategy for a parallel
hybrid electric vehicle assisted by an integrated starter generator,” IEEE Trans. Vehicular
Technology, vol. 59, no. 6, pp. 2689–2699, July 2010.
[72] Y. Qi and S. Ishak, “Stochastic approach for short-term freeway traffic prediction during peak
periods,” IEEE Trans. on Intelligent Transportation Systems, vol. 14, no. 2, pp. 660–672,
2013.
[73] S. Ishak, C. Mamidala, and Y. Qi, “Stochastic characteristics of freeway traffic speed during
breakdown and recovery periods,” J. Transportation Research Board, vol. 2178, pp. 79–89,
2010.
[74] X. Zhang and C. Mi, Vehicle Power Management: Modeling, Control and Optimization.
Springer, 2011.
[75] F. V. Louveaux, “A solution method for multistage stochastic programs with recourse with
application to an energy investment problem,” Operations Research, vol. 28, no. 4, pp. 889–
902, 1980.
[76] J. R. Birge and F. Louveaux, Introduction to Stochastic Programming. Springer, 2011.
[77] T. W. Anderson and L. A. Goodman, “Statistical inference about markov chains,” The Annals
of Mathematical Statistics, vol. 28, no. 1, pp. 89–110, 1957.
[78] N. Bauerle and U. Rieder, Markov Decision Processes with Applications to Finance. Springer,
2011.
[79] ADVISOR Software for Advanced Vehicle Energy Analysis, Available:
http://bigladdersoftware.com/advisor/.
[80] P. Dey, S. Chandra, and S. Gangopadhaya, “Speed distribution curves under mixed traffic
conditions,” J. Transp. Eng., vol. 132, no. 6, pp. 475–481, 2006.
[81] D. Berry and D. Belmont, “Distribution of vehicle speeds and travel times,” 1951, pp. 589–602.
[82] K. Ahn and P. Y. Papalambros, “Engine optimal operation lines for power-split hybrid electric
vehicles,” The Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering,
vol. 223, pp. 1149–1162, 2009.
[83] A. May, Traffic flow fundamentals. Prentice-Hall, Englewood Cliffs, New Jersey, 1990.
[84] A. Messmer and M. Papageorgiou, “Metanet: A macroscopic simulation program for motor-
way networks,” Traffic Eng. and Control, vol. 31, no. 9, pp. 466–470, 1990.
[85] I. Prigogine and F. C. Andrews, “A Boltzmann-like approach for traffic flow,” Operations
Research, vol. 8, no. 6, pp. 789–797, 1960.
[86] I. Prigogine and R. Herman, Kinetic Theory of Vehicular Traffic. American Elsevier, New
York, 1971.
[87] T. Szczuraszek and R. Krystek, “A macroscopic model for traffic speed prediction on two-
lane roads,” Transportation Systems: Theory and Application of Advanced Technology, pp.
185–190, 1995.
[88] B. L. Smith and M. J. Demetsky, “Traffic flow forecasting: comparison of modeling ap-
proaches,” J. of Transportation Engineering, vol. 123, no. 4, pp. 261–266, 1997.
[89] C. Chen, J. Hu, Q. Meng, and Y. Zhang, “Short-time traffic flow prediction with ARIMA-GARCH
model,” in Proc. IEEE Symp. on Intelligent Vehicles, June 2011.
[90] B. Zhang, K. Xing, X. Cheng, L. Huang, and R. Bie, “Traffic clustering and online traffic
prediction in vehicle networks: A social influence perspective,” in Proc. of IEEE INFOCOM,
2012, pp. 495–503.
[91] V. Hodge, R. Krishnan, T. Jackson, J. Austin, and J. Polak, “Short term traffic prediction
using a binary neural network,” in Proc. Annu. Universities’ Transport Study Group Conf.,
Jan. 2011.
[92] J. Park, D. Li, Y. Murphey, J. Kristinsson, R. McGee, M. Kuang, and T. Phillips, “Real time
vehicle speed prediction using a neural network traffic model,” in Proc. of Int. Joint Conf. on
Neural Networks, July 2011, pp. 2991–2996.
[93] S. Rogers and W. Zhang, “Development and evaluation of a curve rollover warning system for
trucks,” in Proc. IEEE Intelligent Vehicles Symp., June 2003, pp. 294–297.
[94] J. McNew, “Predicting cruising speed through data-driven driver modeling,” in Proc. IEEE
Conf. on Intelligent Transportation Systems, Sept. 2012, pp. 1789–1796.
[95] D. Li, T. Bansal, Z. Lu, and P. Sinha, “Marvel: Multiple antenna based relative vehicle
localizer,” in Proc. Int. Conf. on Mobile Computing and Networking, 2012, pp. 245–256.
[96] L. Rabiner, “A tutorial on hidden Markov models and selected applications in speech
recognition,” Proc. of the IEEE, vol. 77, no. 2, pp. 257–286, Feb. 1989.
[97] A. Jain, J. Mao, and K. Mohiuddin, “Artificial neural networks: a tutorial,” Computer, vol. 29,
no. 3, pp. 31–44, Mar. 1996.
[98] J. A. Bilmes et al., “A gentle tutorial of the EM algorithm and its application to parameter
estimation for Gaussian mixture and hidden Markov models,” Int. Computer Science Institute,
vol. 4, no. 510, p. 126, 1998.
[99] H. Akaike, “A new look at the statistical model identification,” IEEE Trans. Automatic Control,
vol. 19, no. 6, pp. 716–723, Dec 1974.
[100] W. Zucchini and I. L. MacDonald, Hidden Markov Models for Time Series: An Introduction
Using R. Chapman and Hall/CRC, 2009.
[101] J. J. Dziak, D. L. Coffman, S. T. Lanza, and R. Li, “Sensitivity and specificity of information
criteria,” College of Health and Human Development, The Pennsylvania State University,
Tech. Rep. 12-119, June 2012.
[102] L. E. Baum, T. Petrie, G. Soules, and N. Weiss, “A maximization technique occurring in the
statistical analysis of probabilistic functions of Markov chains,” The Annals of Mathematical
Statistics, vol. 41, no. 1, pp. 164–171, 1970.
[103] L. E. Baum, J. Eagon et al., “An inequality with applications to statistical estimation for
probabilistic functions of Markov processes and to a model for ecology,” Bull. Amer. Math.
Soc., vol. 73, no. 3, pp. 360–363, 1967.
[104] L. E. Baum and G. Sell, “Growth functions for transformations on manifolds,” Pacific J. Math,
vol. 27, no. 2, pp. 211–227, 1968.
[105] Ponts et Chaussees Traffic Count, Available: http://www.pch.public.lu/trafic/comptage/index.html.
[106] A. Grzybek, G. Danoy, and P. Bouvry, “Generation of realistic traces for vehicular mobil-
ity simulations,” in Proc. ACM Int. Symp. on Design and Analysis of Intelligent Vehicular
Networks and Applications, 2012, pp. 131–138.
[107] VehiLux: Realistic Vehicular Traces for VANETS Simulator, Available:
http://vehilux.gforge.uni.lu/index.html.
[108] SUMO: Simulation of Urban MObility, Available: http://sumo.dlr.de/wiki/Main_Page.
[109] C. Gawron, “An iterative algorithm to determine the dynamic user equilibrium in a traffic
simulation model,” Int. J. of Modern Physics C, vol. 9, no. 3, pp. 393–407, Dec 1998.
[110] Neural Network Toolbox, Available: http://www.mathworks.com/products/neural-network/.
[111] Hidden Markov Model (HMM) Toolbox for Matlab, Available:
http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html.