ieee transactions on power systems 1 a data-driven …wzy.ece.iastate.edu/cv/c2.pdf · ieee proof 2...

12
IEEE Proof IEEE TRANSACTIONS ON POWER SYSTEMS 1 A Data-Driven Game-Theoretic Approach for Behind-the-Meter PV Generation Disaggregation 1 2 Fankun Bu , Student Member, IEEE, Kaveh Dehghanpour , Member, IEEE, Yuxuan Yuan, Student Member, IEEE, Zhaoyu Wang , Member, IEEE, and Yingchen Zhang , Senior Member, IEEE 3 4 Abstract—Rooftop solar photovoltaic (PV) power generator is a 5 widely used distributed energy resource (DER) in distribution sys- 6 tems. Currently, the majority of PVs are installed behind-the-meter 7 (BTM), where only customers’ net demand is recorded by smart 8 meters. Disaggregating BTM PV generation from net demand is 9 critical to utilities for enhancing grid-edge observability. In this 10 paper, a data-driven approach is proposed for BTM PV genera- 11 tion disaggregation using solar and demand exemplars. First, a 12 data clustering procedure is developed to construct a library of 13 candidate load/solar exemplars. To handle the volatility of BTM 14 resources, a novel game-theoretic learning process is proposed to 15 adaptively generate optimal composite exemplars using the con- 16 structed library of candidate exemplars, through repeated eval- 17 uation of disaggregation residuals. Finally, the composite native 18 demand and solar exemplars are employed to disaggregate solar 19 generation from net demand using a semi-supervised source sepa- 20 rator. The proposed methodology has been verified using real smart 21 meter data and feeder models. Q1 22 Index Terms—Rooftop solar photovoltaic, distribution system, 23 source disaggregation, game theory. 24 I. INTRODUCTION 25 I N PRACTICE, the majority of residential rooftop PVs are 26 installed behind-the-meter (BTM), where only the net de- 27 mand is recorded, which equals native demand minus the solar 28 power generation. Therefore, PV generation is usually invisible 29 to distribution system operators. This invisibility, along with the 30 stochastic nature of solar power, can cause new problems for 31 utilities, such as inaccurate load forecasting and estimation [1], 32 [2], inefficient service restoration [3], [4], and sub-optimal net- 33 work expansion decisions [5]–[7]. Thus, it is of significance to 34 disaggregate PV generation from net demand to enhance grid- 35 edge observability. One solution to this problem is to monitor 36 each single rooftop PV generation by installing extra metering 37 Manuscript received July 15, 2019; revised November 23, 2019; accepted January 11, 2020. This work was supported by Advanced Grid Modeling Program at the U.S. Department of Energy Office of Electricity under Grant DE- OE0000875. Paper no. TPWRS-01023-2019. (Corresponding author: Zhaoyu Wang.) F. Bu, K. Dehghanpour, Y. Yuan, and Z. Wang are with the Department of Electrical and Computer Engineering, Iowa State University, Ames, IA 50011 USA (e-mail: [email protected]; [email protected]; [email protected]; [email protected]). Y. Zhang is with the National Renewable Energy Laboratory, Golden, CO 80401 USA (e-mail: [email protected]). Color versions of one or more of the figures in this article are available online at https://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TPWRS.2020.2966732 devices. However, due to the large number of distributed PVs, 38 this option comes at a significant cost for utilities. 39 To avoid costly metering infrastructure expansion, two cat- 40 egories of approaches have been proposed in the literature to 41 disaggregate PV generation from net demand: 42 Category I. Model-based methods: Parametric models along 43 with weather information have been used to estimate solar 44 generation [8]. In [1], a virtual equivalent PV station model is 45 utilized to represent the total generation of BTM PVs in a region, 46 where model parameters are obtained by solving an optimization 47 problem. In [9], the clear sky generation model is combined with 48 a physical PV panel model to estimate solar generation. This 49 model-based framework requires meteorological data, precise 50 geographic information, and accurate physical characteristics 51 of PV arrays. The major shortcoming of model-based solutions 52 for solar disaggregation is the unavailability and uncertainty of 53 model parameter information [10], which is further complicated 54 by limited access to unauthorized BTM installations [5]. More- 55 over, model-based solutions are subject to gross overestimation 56 of solar generation in case of BTM PV failure [11]. 57 Category II. Data-driven methods: As the advanced metering 58 infrastructure (AMI) has been widely deployed in distribution 59 systems in recent years, utilities have gained access to large 60 amounts of smart meter data [12], [13]. To mine the hid- 61 den information contained within various data sources with 62 both sufficient temporal and spatial granularity, data-driven 63 approaches have been proposed by researchers for different 64 applications, such as energy disaggregation [14], load fore- 65 casting [1], load management [15] and fault detection [11]. 66 In particular, learning-based approaches have drawn significant 67 attention among both researchers and industry practitioners. 68 Measurement data from various sources, including smart me- 69 ters, supervisory control and data infrastructure (SCADA), and 70 micro-phasor measurement units (μPMU), have been utilized 71 to perform solar disaggregation from net demand. In [16], a 72 data-driven approach is proposed based on dimension reduction 73 and mapping functions using PV generation measurement data 74 from temporarily-installed sensors. In [17], a linear proxy-based 75 estimator is developed to disaggregate a solar farm generation 76 from feeder-level measurement using μPMU data, along with 77 the measured power profile of nearby PV plants, and global 78 horizontal irradiance (GHI) proxy data. In [18], a PV generation 79 disaggregation approach is presented for groups of residential 80 customers, under the assumption that their aggregate active 81 power is measured at the point of common coupling (PCC) 82 0885-8950 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

Upload: others

Post on 08-Sep-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: IEEE TRANSACTIONS ON POWER SYSTEMS 1 A Data-Driven …wzy.ece.iastate.edu/CV/c2.pdf · IEEE Proof 2 IEEE TRANSACTIONS ON POWER SYSTEMS 83 to the grid. In [19], a non-intrusive load

IEEE P

roof

IEEE TRANSACTIONS ON POWER SYSTEMS 1

A Data-Driven Game-Theoretic Approach forBehind-the-Meter PV Generation Disaggregation

1

2

Fankun Bu , Student Member, IEEE, Kaveh Dehghanpour , Member, IEEE, Yuxuan Yuan, Student Member, IEEE,Zhaoyu Wang , Member, IEEE, and Yingchen Zhang , Senior Member, IEEE

3

4

Abstract—Rooftop solar photovoltaic (PV) power generator is a5widely used distributed energy resource (DER) in distribution sys-6tems. Currently, the majority of PVs are installed behind-the-meter7(BTM), where only customers’ net demand is recorded by smart8meters. Disaggregating BTM PV generation from net demand is9critical to utilities for enhancing grid-edge observability. In this10paper, a data-driven approach is proposed for BTM PV genera-11tion disaggregation using solar and demand exemplars. First, a12data clustering procedure is developed to construct a library of13candidate load/solar exemplars. To handle the volatility of BTM14resources, a novel game-theoretic learning process is proposed to15adaptively generate optimal composite exemplars using the con-16structed library of candidate exemplars, through repeated eval-17uation of disaggregation residuals. Finally, the composite native18demand and solar exemplars are employed to disaggregate solar19generation from net demand using a semi-supervised source sepa-20rator. The proposed methodology has been verified using real smart21meter data and feeder models.

Q1

22

Index Terms—Rooftop solar photovoltaic, distribution system,23source disaggregation, game theory.24

I. INTRODUCTION25

IN PRACTICE, the majority of residential rooftop PVs are26

installed behind-the-meter (BTM), where only the net de-27

mand is recorded, which equals native demand minus the solar28

power generation. Therefore, PV generation is usually invisible29

to distribution system operators. This invisibility, along with the30

stochastic nature of solar power, can cause new problems for31

utilities, such as inaccurate load forecasting and estimation [1],32

[2], inefficient service restoration [3], [4], and sub-optimal net-33

work expansion decisions [5]–[7]. Thus, it is of significance to34

disaggregate PV generation from net demand to enhance grid-35

edge observability. One solution to this problem is to monitor36

each single rooftop PV generation by installing extra metering37

Manuscript received July 15, 2019; revised November 23, 2019; acceptedJanuary 11, 2020. This work was supported by Advanced Grid ModelingProgram at the U.S. Department of Energy Office of Electricity under Grant DE-OE0000875. Paper no. TPWRS-01023-2019. (Corresponding author: ZhaoyuWang.)

F. Bu, K. Dehghanpour, Y. Yuan, and Z. Wang are with the Department ofElectrical and Computer Engineering, Iowa State University, Ames, IA 50011USA (e-mail: [email protected]; [email protected]; [email protected];[email protected]).

Y. Zhang is with the National Renewable Energy Laboratory, Golden, CO80401 USA (e-mail: [email protected]).

Color versions of one or more of the figures in this article are available onlineat https://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TPWRS.2020.2966732

devices. However, due to the large number of distributed PVs, 38

this option comes at a significant cost for utilities. 39

To avoid costly metering infrastructure expansion, two cat- 40

egories of approaches have been proposed in the literature to 41

disaggregate PV generation from net demand: 42

Category I. Model-based methods: Parametric models along 43

with weather information have been used to estimate solar 44

generation [8]. In [1], a virtual equivalent PV station model is 45

utilized to represent the total generation of BTM PVs in a region, 46

where model parameters are obtained by solving an optimization 47

problem. In [9], the clear sky generation model is combined with 48

a physical PV panel model to estimate solar generation. This 49

model-based framework requires meteorological data, precise 50

geographic information, and accurate physical characteristics 51

of PV arrays. The major shortcoming of model-based solutions 52

for solar disaggregation is the unavailability and uncertainty of 53

model parameter information [10], which is further complicated 54

by limited access to unauthorized BTM installations [5]. More- 55

over, model-based solutions are subject to gross overestimation 56

of solar generation in case of BTM PV failure [11]. 57

Category II. Data-driven methods: As the advanced metering 58

infrastructure (AMI) has been widely deployed in distribution 59

systems in recent years, utilities have gained access to large 60

amounts of smart meter data [12], [13]. To mine the hid- 61

den information contained within various data sources with 62

both sufficient temporal and spatial granularity, data-driven 63

approaches have been proposed by researchers for different 64

applications, such as energy disaggregation [14], load fore- 65

casting [1], load management [15] and fault detection [11]. 66

In particular, learning-based approaches have drawn significant 67

attention among both researchers and industry practitioners. 68

Measurement data from various sources, including smart me- 69

ters, supervisory control and data infrastructure (SCADA), and 70

micro-phasor measurement units (μPMU), have been utilized 71

to perform solar disaggregation from net demand. In [16], a 72

data-driven approach is proposed based on dimension reduction 73

and mapping functions using PV generation measurement data 74

from temporarily-installed sensors. In [17], a linear proxy-based 75

estimator is developed to disaggregate a solar farm generation 76

from feeder-level measurement using μPMU data, along with 77

the measured power profile of nearby PV plants, and global 78

horizontal irradiance (GHI) proxy data. In [18], a PV generation 79

disaggregation approach is presented for groups of residential 80

customers, under the assumption that their aggregate active 81

power is measured at the point of common coupling (PCC) 82

0885-8950 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See https://www.ieee.org/publications/rights/index.html for more information.

Page 2: IEEE TRANSACTIONS ON POWER SYSTEMS 1 A Data-Driven …wzy.ece.iastate.edu/CV/c2.pdf · IEEE Proof 2 IEEE TRANSACTIONS ON POWER SYSTEMS 83 to the grid. In [19], a non-intrusive load

IEEE P

roof

2 IEEE TRANSACTIONS ON POWER SYSTEMS

to the grid. In [19], a non-intrusive load monitoring (NILM)83

approach is proposed to disaggregate PV generation from net84

demand using measurements with 1-second resolution. In [10],85

the capacity of residential rooftop PVs are estimated using86

customers’ net load curve features. In addition, several data-87

driven methods have been applied for disaggregating house-level88

demand into appliance-level energy [14], [20], [21]. However,89

these energy disaggregation approaches also require input data90

with high temporal resolution in the range of a few seconds.91

Datasets with this level of granularity are not generally available92

to utilities, which applying these approaches difficult for solving93

solar disaggregation problem. To sum up, the chief limitations94

of previous data-driven methods are: dependence on installation95

of costly μPMU devices and high-resolution sensors throughout96

the network, availability of massive PV generation data, vulnera-97

bility to customer behavior volatility, and the inability to estimate98

all relevant parameters of BTM PV generators, including panel99

orientations, which impact the time-series generation profile.100

Considering the drawbacks of previous methods and the101

emerging of smart meter data source, in this paper, a novel game-102

theoretic data-driven approach is proposed for disaggregating103

PV generation using only smart meter data. The proposed ap-104

proach exploits the observed correlations within real utility data.105

The basic idea is to use the native demand and PV generation of106

fully observable customers to disaggregate the native demand107

and PV generation of customers with only known net demand.108

Accordingly, a spectral clustering (SC) algorithm is employed109

to construct solar and native demand candidate exemplars using110

the data from fully observable customers, which are then stored111

in an exemplar library. Next, PV generation disaggregation is112

formulated as a nested bi-layer optimization problem: At the113

outer layer, a semi-supervised signal separation (SSS) algorithm114

receives the composite native demand and solar exemplars from115

the inner-layer to disaggregate the native demand and PV gener-116

ation from customers’ net demand. The outer layer of the solar117

disaggregation process is subject to the response of the inner118

layer, at which a learning mechanism is developed to find optimal119

weights that are assigned to candidate native demand and solar120

generation exemplars to construct composite exemplars. This121

mechanism is based on the concept of repeated games with122

vector payoff (RGVP) [22] in game theory. While game theory123

has been previously applied in power system studies [15], [23],124

[24], we have not found application of RGVP theory to address125

the solar disaggregation challenge as presented in this paper.126

The learned weights are continuously updated over time using127

the disaggregation residual as a feedback signal. The purpose of128

this novel adaptive composite exemplar construction strategy is129

to provide optimal response to the volatile and variable behavior130

of customers and solar generation profiles at the grid-edge.131

The proposed method is validated using advanced metering132

infrastructure (AMI) data from our utility partners. In this pa-133

per, vectors are denoted using bold italic lowercase letters and134

matrices are denoted as bold uppercase letters.135

The main contributions are summarized as follows:136� A data-driven learning-based approach is proposed for137

disaggregating BTM PV generation using only smart me-138

ter data. This method has been numerically compared139

with a model-based benchmark and has shown to have140

considerable improvements under incomplete information 141

of BTM PV parameters. 142� To find the hidden native demand and solar power values 143

corresponding to different patterns, a closed loop game- 144

theoretic approach is designed to learn the weights assigned 145

to candidate exemplars for composite exemplar construc- 146

tion. 147� The time-varying weights that are used for exemplar con- 148

struction significantly enhance the adaptability of the dis- 149

aggregator to unknown abnormal BTM events, such as PV 150

failure and unauthorized installation and expansion of solar 151

arrays. 152

The rest of the paper is organized as follows: Section II intro- 153

duces the overall framework of BTM PV generation disaggre- 154

gation approach and describes smart meter dataset. Section III 155

proposes the method for constructing candidate exemplars. Sec- 156

tion IV describes the procedure of disaggregating BTM PV 157

generation and native demand from net demand. In Section V, a 158

game-theoretic learning process is presented to obtain optimal 159

composite PV generation and native demand exemplars. In 160

Section VI, case studies are analyzed and Section VII concludes 161

the paper. 162

II. PROPOSED BTM PV GENERATION DISAGGREGATION 163

FRAMEWORK AND DATASET DESCRIPTION 164

A. Overall Framework of the Proposed Approach 165

In this paper, the customers are classified into three types: (I) 166

SP denotes the set of fully observable end-users without PVs, 167

whose native demands are directly measured by smart meters. 168

Note that the net demand of these customers equals their native 169

demand. (II) SG denotes the set of fully observable customers 170

with PV generation resources. Both the native demand and the 171

solar generation of these customers are measured separately. 172

(III) SN represents the group of customers with BTM PVs 173

and net demand measurements. The native demand and solar 174

generations of these customers are unknown to the utilities. The 175

goal of this paper is to separate aggregate BTM PV generation 176

of groups of customers in SN . 177

The basic idea of the proposed BTM PV generation disaggre- 178

gation approach is based on the observations that (1) the native 179

demand of sufficiently-large groups of customers are highly 180

correlated, (2) the PV generation of customers with similar 181

orientation are highly correlated, and (3) the correlation between 182

native demand and PV generation is very small. These three 183

observations can be corroborated using real native demand and 184

PV generation data. Fig. 1(a) shows the correlation between the 185

native demands of two groups of fully observable customers 186

in SP and SG, where, N1 and N2 denote size of each group. 187

It can be seen that as the number of customers in each group 188

increases, the correlation between the aggregate native demands 189

of the customer rises as well. Fig. 1b illustrates the impact of 190

PV panel azimuth on the pairwise PV generation correlation 191

of customers, where A1 and A2 denote the azimuths of two PV 192

panels. It can be seen that as the difference between the azimuths 193

of PV panels decreases the correlation between the solar power 194

increases significantly. Hence, the similarity in solar generation 195

is mainly due to similar panel orientations as expected [9], [18]. 196

Page 3: IEEE TRANSACTIONS ON POWER SYSTEMS 1 A Data-Driven …wzy.ece.iastate.edu/CV/c2.pdf · IEEE Proof 2 IEEE TRANSACTIONS ON POWER SYSTEMS 83 to the grid. In [19], a non-intrusive load

IEEE P

roof

BU et al.: DATA-DRIVEN GAME-THEORETIC APPROACH FOR BEHIND-THE-METER PV GENERATION DISAGGREGATION 3

Fig. 1. Observed correlations from real smart meter data.

Fig. 2. Overall structure of the proposed BTM PV generation disaggregationmethod.

In contrast to the significant pairwise correlation between the197

native demands of groups of customers and that of BTM solar198

power outputs of PVs with similar orientations, the correlation199

between native demand and BTM PV generation is significantly200

small and less than 0.3. This small correlation can be further201

corroborated by the mismatch between the native demand and202

PV generation peak times. Specifically, PV units generally out-203

put their maximum power during noon, while the native demand204

usually peaks in the afternoon or early evening [25]. The small205

correlation can be explained by the PVs’ zero-output during206

nighttime, which results in a decline in the correlation between207

PV generation and native demand. These three observations set208

the foundation for constructing native demand and PV genera-209

tion exemplars using the data of customers in the sets SP and210

SG, to approximate the unobservable native demand and BTM211

PV generation of customers in the set SN .212

The overall disaggregation process is performed for groups of213

customers connected to the same lateral or secondary distribu-214

tion transformer [6]. The components of this process are shown215

in Fig. 2:216

Component I. Exemplar Library Construction: The library217

consists of typical candidate native demand and solar gener-218

ation profiles. This library is constructed based on the data of219

customers in the sets SP and SG using a spectral clustering (SC)220

algorithm. The SC algorithm automatically identifies customers221

with similar native load patterns and solar generation profiles.222

The previously-discussed correlations are used as a measure of223

similarity within the clustering algorithm. The outcomes of the224

SC method are the native demand and solar power cluster centers225

that are added as candidate exemplars to the library.226

Component II. Composite Exemplar Construction: A227

weighted averaging operation is performed over the candidate228

native demand and PV generation exemplars within the library229

to generate composite native demand and solar generation ex- 230

emplars. 231

Component III. BTM PV Generation Disaggregation: An 232

SSS method is developed to disaggregate the net demand of 233

customers in SN by finding the optimal coefficients that de- 234

termine the share of composite native demand and solar power 235

exemplars within the net demand. The objective of the coefficient 236

optimization process is to minimize the disaggregation residuals. 237

Component IV. Game-theoretic Learning: A RGVP-based 238

learning process is designed to assign and update the weights for 239

each candidate exemplar in the library over time. These updated 240

weights are then used to generate the composite native demand 241

and solar generation exemplars for the next time point (Compo- 242

nent II). This game-theoretic mechanism adaptively revises the 243

behavior of the disaggregator in response to the time-varying 244

solar power and native demand. The disaggregation process 245

can be converted into a nested bi-layer optimization problem 246

formulated as follows: 247

minpppt,gggt,αt,βt

1

2(||pppt − pppCt αt||22 + ||gggt − gggCt βt||22) (1a)

s.t. pppt + gggt = pppnt (1b)

pppCt = [pppc1t , . . . , pppcMt ]ωωω∗t (1c)

gggCt = [gggc1t , . . . , gggcNt ]θθθ∗t (1d)

{ωωω∗t , θθθ∗t} = argminωωωt,θθθt

Φλ(pppnt−1,P

ct ,G

ct , α

∗t−1, β

∗t−1)

(1e)

s.t.M∑

i=1

ωi,t = 1, ωi,t ≥ 0 (1f)

N∑

j=1

θj,t = 1, θj,t ≥ 0 (1g)

where, || · ||2 denotes l2-norm. Note that all the demand and 248

generation variables in this equation are defined over a time 249

window of length T , where a vector xxxt represents data sam- 250

ples of variable x in the time window [t− T + 1, t] as, xxxt = 251

[x(t− T + 1), . . . , x(t)]. The objective of the outer layer is to 252

minimize the summation of the overall disaggregation residuals, 253

consisting of two components: 1) the difference between the 254

actual native demand, pppt, and its alternative epitome pppCt αt, and 255

2) the difference between the actual BTM PV generation, gggt, 256

and its alternative epitome gggCt βt. Here, αt and βt determine the 257

proportions of demand and solar powers within the net demand, 258

for the given composite native demand and solar exemplars, 259

denoted by pppCt and gggCt , respectively. Constraint (1b) ensures 260

that the summation of the native demand and PV generation 261

equals the observed net demand, pppnt , which is measured by 262

AMI. Constraints (1c) and (1d) represent the construction of 263

composite native demand and BTM PV generation exemplars, 264

where pppcit and gggcjt are the candidate native demand and BTM 265

PV generation exemplars, respectively. The composite exem- 266

plar construction process employs the weight vectors, ωωωt = 267

[ω1,t, . . . , ωM,t] and θθθt = [θ1,t, . . . , θN,t], where ωi,t and θj,t 268

Page 4: IEEE TRANSACTIONS ON POWER SYSTEMS 1 A Data-Driven …wzy.ece.iastate.edu/CV/c2.pdf · IEEE Proof 2 IEEE TRANSACTIONS ON POWER SYSTEMS 83 to the grid. In [19], a non-intrusive load

IEEE P

roof

4 IEEE TRANSACTIONS ON POWER SYSTEMS

are the weights corresponding to candidate exemplars pppcit and269

gggcjt , respectively. Note that each candidate exemplar represents270

the typical native demand/solar generation profile in the time271

window [t− T + 1, t], which are stored in the exemplar library.272

The objective of the inner layer, (1e), is to minimize the param-273

eterized potential function, Φλ, of the game-theoretic learning274

process, with parameter λ, to reduce the long-term estimation275

regret, where Pct = [pppc1t , . . . , pppcMt ] and Gc

t = [gggc1t , . . . , pppcNt ] are276

the native demand and solar generation candidate exemplar277

libraries, respectively. λ is a user-defined parameter that deter-278

mines the speed of updating of the weights in the game-theoretic279

framework (i.e., higher λ implies faster updates). Note that the280

inner layer optimizes the weights at time t using the measured281

net demand and the outcome of the outer layer at time t− 1.282

Constraints (1f) and (1g) ensure that the weights assigned to the283

candidate exemplars are non-negative and have l1-norms equal284

to one. The game-theoretic process assigns higher weight values285

to candidate exemplars that have higher impact on reducing the286

overall disaggregation residuals.287

B. Dataset Description288

In this paper, net demand of individual customers is con-289

structed by subtracting real BTM PV generation from real native290

demand. The hourly native demand and PV generation data are291

from Midwest U.S. utilities. The time rage of the dataset is one292

year and it contains 1120 customers and 337 PVs. The nominal293

capacity of PVs ranges from 3 kW to 8 kW. These data are294

available online [26].295

III. CANDIDATE EXEMPLAR LIBRARY CONSTRUCTION296

AND COMPOSITE EXEMPLAR GENERATION297

The time-series data of fully observable customers in SP and298

SG are leveraged to construct the candidate native demand and299

solar generation exemplar library using an SC method [27], [28].300

In performing SC, two graphs are developed independently, each301

corresponding to the typical native demand and the typical PV302

generation libraries. The general steps in constructing the candi-303

date exemplar libraries and composite demand/solar exemplars304

are as follows:305

Step I. Developing similarity graphs: The basic idea of SC306

technique is to solve the clustering problem using graph theory.307

To achieve this, a similarity graph, G = (V,E), is developed308

using AMI data. V denotes the set of vertices of the graph and309

E is the set of edges connecting vertices. In our work, for the310

native demand, the average daily load profiles of customers in311

set SP are defined as the graph vertices. For PV generation, the312

normalized solar power profiles of the PVs in set SG are defined313

as graph vertices. The basic idea is to connect the vertices in V314

that are similar to each other. We have used a Gaussian kernel315

function as a measure of similarity, as shown below:316

Wi,j = exp

(−||Vi −Vj ||22ρiρj

)(2)

where, Wi,j is the weight assigned to the edge connecting317

verticesVi andVj , and ρi and ρj are tunable scaling parameters318

for vertices Vi and Vj . The two vertices, Vi and Vj , are319

connected when the weight of the corresponding edge, Wi,j , 320

is larger than 0 (i.e., they have non-trivial similarities). 321

Step II. Developing Graph Laplacian matrices: Based on 322

similarity graphs obtained from demand/solar power data, the 323

clustering process is transformed into a graph partitioning prob- 324

lem, which cuts a graph into multiple smaller sections by remov- 325

ing edges. The graph partitioning can be conducted in different 326

ways according to different objective functions. In this paper, 327

the objective function is to roughly maximize the dissimilar- 328

ity between the different graph clusters while minimizing the 329

similarity within each cluster [27]: 330

NG = minC1,...,Cµ

μ∑

i=1

ϕ(Ci,V \Ci)

d(Ci)(3)

where, μ is the number of clusters, Ci is the i’th cluster in 331

graph G, V \Ci represents the vertices of V that are not in Ci, 332

ϕ(Ci,V \Ci) represents the sum of the weights in Ci and V \ 333

Ci, d(Ci) denotes the sum of the weights of the vertices in Ci. 334

It has been shown that the partitioning problem (equation (3)) 335

can be solved using the eigenvectors of the normalized Graph 336

Laplacian matrix [27], L, as a reduced-order representation of 337

the original data. The Laplacian is obtained as follows: 338

L = D−12WD−

12 (4)

where, D is a diagonal matrix whose diagonal elements equal 339

the sum of elements in each row of W. To obtain Laplacian 340

eigenvalues, {μ1, μ2, . . ., μn}, and the corresponding eigenvec- 341

tors, eigen-decomposition is performed on L ∈ Rn×n. The first 342

k eigenvectors corresponding to the first k largest eigenvalues 343

are concatenated into a new matrix E ∈ Rn×k. 344

Step III. Obtaining candidate exemplars: The matrixE can be 345

considered as the new representation of dataset, which embeds 346

vertices in a lower-dimension space. It has been shown that 347

this new matrix improves the cluster-properties of the data [27]. 348

Then, k-means algorithm is performed to cluster the rows of E. 349

To find the optimal number of clusters, the modified Hubert 350

Γ statistic index is adopted for calibration [29]. After that, 351

customers in the sets SP and SG are classified into M and N 352

clusters, respectively. The corresponding candidate native de- 353

mand and solar generation exemplars, pppcit and gggcjt , are obtained 354

using the cluster centers, which equal the average demand/solar 355

powers for the customers belonging to each cluster. 356

Step IV. Constructing composite exemplars: The input weights 357

from the RGVP module are used to build composite native 358

demand and PV generation exemplars through an averaging 359

process over the candidate exemplars: 360

pppCt =

M∑

i=1

pppcit ωi,t (5a)

gggCt =

N∑

j=1

gggcjt θj,t (5b)

The weights, ωi,t and θj,t, are obtained from the RGVP-based 361

learning process, which is elaborated in Section V. 362

Page 5: IEEE TRANSACTIONS ON POWER SYSTEMS 1 A Data-Driven …wzy.ece.iastate.edu/CV/c2.pdf · IEEE Proof 2 IEEE TRANSACTIONS ON POWER SYSTEMS 83 to the grid. In [19], a non-intrusive load

IEEE P

roof

BU et al.: DATA-DRIVEN GAME-THEORETIC APPROACH FOR BEHIND-THE-METER PV GENERATION DISAGGREGATION 5

IV. BTM PV GENERATION AND NATIVE363

DEMAND DISAGGREGATION364

Using the constructed composite native demand and solar365

generation exemplars for customers in the sets SP and SG,366

the task of SSS is to estimate the coefficients of the composite367

exemplars, αt and βt, which are unknown a priori. These co-368

efficients determine the optimal disaggregation of the measured369

net demand for customers in SN . This problem is formulated as370

a residual minimization problem:371

minpppt,gggt,αt,βt

1

2(||pppt − pppCt αt||22 + ||gggt − gggCt βt||22) (6a)

s.t. pppt + gggt = pppnt (6b)

where, the disaggregation residual is defined using the l2-norm,372

which yields a convex and differentiable optimization problem373

that can be efficiently represented as a least-squares problem,374

assuming that the measurement noise follows a Gaussian distri-375

bution. Leveraging the negligible correlation between the native376

demand and PV generation, the optimization problem can be377

solved efficiently in real-time using normal equations, which378

are based on introducing Lagrange multipliers to the constraint379

(6b) and employing the gradient of the objective function (6a).380

This process yields the following optimal solutions:381[α∗tβ∗t

]= (Xt

TXt)−1Xt

Tpppnt (7)

where, Xt = [pppCt , gggCt ]. Note that althoughpppt andgggt are decision382

variables of interest in equation (6), they cannot be recovered ex-383

plicitly, as described in [30]. Instead, the optimal coefficients α∗t384

and β∗t can be explicitly optimized and are used to approximate385

the disaggregated native demand, pppt, and BTM PV generation,386

gtgtgt, for customers in the set SN as follows:387

pppt = pppCt α∗t (8a)

gggt = gggCt β∗t (8b)

V. RGVP-BASED WEIGHT LEARNING388

In practice, the weights assigned to the candidate exemplars,389

ωωωt and θθθt, are unknown a priori due to the unobservability of390

real native demand and PV generation of customers in the set391

SN . In this section, a novel adaptive game-theoretic learning392

process is designed to learn these weights and then to generate393

and update optimal composite exemplars over time. The main394

idea is to handle the variations and volatility of unknown native395

demand and BTM solar generation by minimizing the long-term396

disaggregation residuals.397

The weight updating process is cast as a repeated game model398

with vector payoff, in which two components are defined: a399

player and a set of experts [22], which in our problem correspond400

to the disaggregator and the candidate exemplars, respectively.401

The experts provide “advice” (i.e., typical load/solar patterns)402

to the player, who then combines them to obtain the composite403

demand/solar exemplars. To do this, the player assigns weights404

to each constructed candidate exemplar and performs weighted405

averaging to build time-series composite exemplars, pppCt and406

Fig. 3. Detailed structure of the RGVP module from Fig. 2.

gggCt . To optimize these weights, the player determines regret 407

values for the advice of each expert. The regret value for each 408

expert represents the difference between the player’s loss when 409

using the composite exemplars and the loss when using the 410

candidate exemplar. Intuitively, a negative loss indicates how 411

much the player is better off by performing disaggregation using 412

composite exemplars instead of each candidate exemplar. A 413

potential function is used to minimize the magnitude of accu- 414

mulated regret vector over time by employing a gradient-based 415

search process to optimize weight values. After that, the optimal 416

weights are passed on to construct composite exemplars and 417

perform the disaggregation process for customers in the set SN , 418

as described in Sections III and IV. The RGVP steps are shown 419

in Fig. 3 and described as follows: 420

Step I. Initialization: t← t0; uniform distribution is used 421

to initialize the weights assigned to candidate exemplars, i.e., 422

ωi,t ← 1M and θj,t ← 1

N . 423

Step II. Construct the latest composite exemplars: Receive 424

candidate exemplars from exemplar libraries constructed using 425

SC. Then, assignωωωt and θθθt to these candidate exemplars to con- 426

struct composite native demand and PV generation exemplars 427

as shown in Section III. 428

Step III. Disaggregation using the latest composite exemplars: 429

Pass the generated composite exemplars, pppCt and gggCt , to SSS 430

(Section IV) to perform disaggregation. The estimated net de- 431

mand is calculated using the disaggregated native demand and 432

solar generation as follows: 433

ppptn = pppt + gggt (9)

Page 6: IEEE TRANSACTIONS ON POWER SYSTEMS 1 A Data-Driven …wzy.ece.iastate.edu/CV/c2.pdf · IEEE Proof 2 IEEE TRANSACTIONS ON POWER SYSTEMS 83 to the grid. In [19], a non-intrusive load

IEEE P

roof

6 IEEE TRANSACTIONS ON POWER SYSTEMS

Step IV. Determine disaggregation residual: The disaggrega-434

tion residual for the composite exemplars is obtained using the435

measured and estimated net demand as follows:436

eCt = ||ppptn − pppnt ||1 (10)

where, || · ||1 denotes l1-norm.437

Step V. Disaggregation using the latest candidate native de-438

mand exemplars: Instead of using composite native demand439

exemplar for disaggregation, candidate native demand exem-440

plars are leveraged to perform SSS. To do this, the candidate441

native demand and composite PV generation exemplar pairs442

{pppcit , gggCt } are passed to the SSS in parallel, ∀i ∈ {1, . . .,M}.443

The outcomes are the disaggregated native demand and solar444

generation for each pair, denoted as pppcip,t and gggcip,t, respectively.445

These obtained signals are used to reconstruct the net demand446

corresponding to each candidate native demand exemplar, pppn,cip,t ,447

∀i ∈ {1, . . .,M}, as follows:448

pppn,cip,t ← pppcip,t + gggcip,t (11)

Finally, the disaggregation residual corresponding to each can-449

didate native demand exemplar,pppcit , is obtained as shown below:450

ecip,t = ||pppn,cip,t − pppnt ||1 (12)

Step VI. Disaggregation using the latest candidate solar451

generation exemplars: The process introduced in Step V is452

performed symmetrically over candidate solar generation exem-453

plars. Accordingly, the composite native demand and candidate454

PV generation exemplar pairs {pppCt , gggcjt } are passed to the SSS,455

∀j ∈ {1, . . ., N}, where the disaggregated native demand and456

solar generation are obtained, denoted as pppcjg,t and gggcjg,t, respec-457

tively. These disaggregated signals are then used to reconstruct458

the net demand, pppn,cjg,t , corresponding to each candidate solar ex-459

emplar and determine the disaggregation residuals ecjg,t, similar460

to equation (12).461

Step VII. Update candidate regrets: The player’s instanta-462

neous regrets for each native demand and solar generation463

candidate exemplars are calculated as follows:464

rcip,t = eCt − ecip,t i = 1, . . . ,M (13a)

rcjg,t = eCt − e

cjg,t j = 1, . . . , N (13b)

here, rcip,t and rcjg,t represent the regrets for candidate demand and465

solar exemplars, respectively, which are measured in terms of466

the payoffs in disaggregation residuals by following the advice467

of candidate exemplars instead of composite exemplars at time468

t. The cumulative regrets for the i’th candidate native demand469

exemplar and the j’th candidate PV generation exemplar are470

defined by summing up the instantaneous regret rcip,t and rcjg,t for471

all previous time instants in [t0, t], as follows:472

Rcip,t =

t∑

t′=t0

rcip,t′ i = 1, . . . ,M (14a)

Rcjg,t =

t∑

t′=t0

rcjg,t′ j = 1, . . . , N (14b)

TABLE INUMBER OF CUSTOMERS AND PVS IN LATERALS

By assigning accumulated regret values to each expert, the 473

regret vectors are obtained as RRRp,t = [Rc1p,t, . . . , R

cMp,t ]

T and 474

RRRg,t = [Rc1g,t, . . . , R

cNg,t ]

T for candidate native demand and solar 475

generation exemplars, respectively. 476

Step VIII. Update weights: The goal of RGVP is to reduce the 477

magnitude of the accumulated regret vectorsRRRp,t andRRRg,t. To do 478

this, potential functions are assigned to these accumulated vector 479

spaces. These scalar potential functions are increasing with 480

respect to the advisors’ accumulated regrets. Hence, reducing the 481

accumulative regret is transformed into minimizing the values 482

of these potential functions [22]. In this paper, we have adopted 483

exponential potential functions, the gradients of which are used 484

to update the weights as follows [22]: 485

ωi,t+1 = ∇Φλ(RRRp,t)i =eλR

cip,t

∑Mj=1 e

λRcjp,t

i = 1, . . . ,M (15a)

θj,t+1 = ∇Φλ(RRRg,t)j =eλR

cjg,t

∑Ni=1 e

λRcig,t

j = 1, . . . , N (15b)

where, 486

Φλ(uuu) =1

λln

(L∑

i=1

eλui

)(16)

is an exponential potential operator with uuu = [u1, . . . , uL]T, λ 487

is a positive tunable parameter indicating the updating speed of 488

weights, which is adopted from literature as λ =√

8ln(L)/T 489

[22], with L = M or L = N . 490

Step IX. Moving the disaggregation window: t← t+ 1; go 491

back to Step II. 492

An algorithmic overview of the aforementioned steps of BTM 493

PV generation disaggregation is summarized in Algorithm 1. 494

VI. CASE STUDY 495

In this section, the proposed BTM PV generation disaggrega- 496

tion method is verified using real smart meter data described in 497

Section II and the one-line diagram of a 240-node distribution 498

grid presented in [26]. The proposed approach is applied to 499

disaggregate single-phase lateral- and transformer-level PV gen- 500

eration over a one-year data period. The number of customers, 501

NP , and BTM PV generators, NG, connected to the system 502

laterals are shown in Table I. 503

The tunable parameters in the proposed BTM PV generation 504

disaggregation approach include the number of candidate native 505

Page 7: IEEE TRANSACTIONS ON POWER SYSTEMS 1 A Data-Driven …wzy.ece.iastate.edu/CV/c2.pdf · IEEE Proof 2 IEEE TRANSACTIONS ON POWER SYSTEMS 83 to the grid. In [19], a non-intrusive load

IEEE P

roof

BU et al.: DATA-DRIVEN GAME-THEORETIC APPROACH FOR BEHIND-THE-METER PV GENERATION DISAGGREGATION 7

Algorithm 1: BTM PV Generation and Native DemandDisaggregation From Net Demand.

1: procedure Initialization2: t← t0, ωi,t ← 1

M , i ∈ {1, . . .,M}, θj,t ← 1N ,

j ∈ {1, . . ., N}3: end procedure4: Receive {pppc1t , pppc2t , . . . , pppcMt } and {gggc1t , gggc2t , . . . , gggcNt }

from SC5: procedure Perform SSS using pppCt and gggCt6: pppCt ←

∑Mi=1 ppp

cit ωi,t, gggCt ←

∑Nj=1 ggg

cjt θj,t

7: Xt ← [pppCt , gggCt ]

8: {α∗t , β∗t } ← (XtTXt)

−1XtTpppn

t

9: pppt ← pppCt α∗t , gggt ← gggCt β

∗t

10: ppptn ← pppt + gggt

11: end procedure12: procedure Perform SSS using pppcit and gggCt13: Xci

p,t ← [pppcit , gggCt ] i = 1, . . . ,M

14: {αcip,t∗, βci

p,t∗} ← (Xci

p,tTXci

p,t)−1Xci

p,tTpppnt

15: pppcip,t ← pppcit αcip,t∗, gggcip,t ← gggCt β

cip,t∗

16: pppn,cip,t ← pppcip,t + gggcip,t17: end procedure18: procedure Perform SSS using pppCt and ggg

cjt

19: Xcjg,t ← [pppCt , ggg

cjt ] j = 1, . . . , N

20: {αcjg,t∗, β

cjg,t∗} ← (X

cjg,t

TX

cjg,t)−1Xcj

g,tTpppnt

21: pppcjg,t ← pppCt α

cjg,t∗, gggcjg,t ← ggg

cjt β

cjg,t∗

22: pppn,cjg,t ← ppp

cjg,t + ggg

cjg,t

23: end procedure24: procedure Update Regret and Weights (Demand)25: rcip,t = ||ppptn − pppnt ||1 − ||pppn,cip,t − pppnt ||126: Rci

p,t =∑t

t′=t0rcip,t′

27: ωi,t+1 ← eλRcip,t/

∑Mj=1 e

λRcjp,t i = 1, . . . ,M

28: end procedure29: procedure Update Regret and Weights (PV)30: r

cjg,t = ||ppptn − pppnt ||1 − ||pppn,cjg,t − pppnt ||1

31: Rcjg,t =

∑tt′=t0

rcjg,t′

32: θj,t+1 ← eλRcjg,t/

∑Ni=1 e

λRcig,t j = 1, . . . , N

33: end procedure34: t← t+ 135: Go to Step 4

demand exemplars, M , the number of candidate PV generation506

exemplars, N , and the disaggregation time-window length, T .507

To optimize the number of clusters, M and N , the modified508

Hubert Γ statistic index is calculated for different number of509

clusters by running SC [29]. Then, the optimal values of M510

and N are determined by finding the knee point of Γ curve as511

presented in [31]. In our case, M and N are optimized at 4 and512

3, respectively. To tune the length of the moving time window, a513

grid search method was employed to find the minimum net de-514

mand estimation residual in terms of mean absolute percentage515

error (MAPE), calculated as follows:516

MAPE =100%

K∑

t=1

∣∣∣∣∣pn(t)− pn(t)1K

∑Kt=1 |pn(t)|

∣∣∣∣∣ (17)

where, K is the total number of net demand samples. In our 517

case, the identified optimal value of T is 96 hours. 518

These calibrated parameters are then used to perform BTM 519

PV generation disaggregation. Fig. 5 shows the disaggregated 520

PV generation and native demand for a lateral within a one-week 521

period. It can be seen in Fig. 5(a) that the disaggregated PV gen- 522

eration closely fits the actual unobservable solar power, which 523

indicates a high disaggregation accuracy. Also, as demonstrated 524

by the disaggregated and actual PV generation curves during 525

the last day, despite high variations in PV generation profile, the 526

proposed approach can still provide satisfying accuracy. The 527

disaggregated native demand and the actual native demand are 528

shown in Fig. 5b, where the disaggregated native demand can 529

accurately capture the load variations as well. 530

The proposed approach has been applied to 19 laterals in 531

a real distribution system to test its generalizability. The lat- 532

eral divisions are illustrated in Fig. 4, where the nodes that 533

have observable PVs are shown with brown color. The smart 534

meter data from individual customers are aggregated to obtain 535

lateral-level demand profiles, in which the utilities have shown 536

significant interest. In our case study 5% of all customers are 537

fully observable with known PV generation and native demand 538

data (the set SG). The remaining 95% belong to the sets SN and 539

SP , where only either the net demand or the native demand is 540

observable, respectively. The idea is to use the data of SG and 541

SP to disaggregate PV generation of SN . Using the actual PV 542

generation and native demand as ground truths, the fitness of the 543

disaggregated PV generation and native demand are evaluated 544

in terms of MAPE, as shown in Table II, where the solar and 545

demand disaggregation accuracy are denoted as gm and pm, 546

respectively. As can be seen, gm ranges from 4% to 8%, and 547

pm ranges from 6% to 10%. Note that in the case of solar 548

disaggregation using feeder-level demand profile measurements, 549

e.g., SCADA power flow data, the network losses should be 550

removed before disaggregation. To do this, previous works have 551

proposed several power flow-based techniques to estimate and 552

eliminate the network losses [32], [33]. After this, PV generation 553

disaggregation can be performed using our proposed method. 554

The case study is conducted on a standard PC with an Intel(R) 555

Xeon(R) CPU running at 3.70 GHz and with 32.0 GB of RAM. 556

PV generation disaggregation is performed for each lateral illus- 557

trated in Fig. 4 over a year, and the computational time ranges 558

from 4.20 seconds to 4.64 seconds. 559

Furthermore, it is of interest to examine the variations of 560

the obtained game-theoretic weights corresponding to different 561

candidate exemplars. In Fig. 6a, it can be seen that the weights 562

assigned to the three candidate PV generation exemplars for 563

one of the laterals converge to approximately 0.8, 0.1, and 0.1, 564

through the learning process, after which the weight values 565

remain nearly stable. Similar characteristics can be observed 566

in Fig. 6b, which shows the variations of weights assigned to 567

the candidate native demand exemplars for the same lateral. 568

To validate RGVP, Fig. 8 employs error histograms to demon- 569

strate the performance of the disaggregation process correspond- 570

ing to RGVP-based composite exemplars and individual candi- 571

date exemplars, where egt = g(t)− g(t) and ept= p(t)− p(t) 572

represent solar and demand disaggregation errors, respectively. 573

As can be seen, the error distribution under composite exemplars 574

Page 8: IEEE TRANSACTIONS ON POWER SYSTEMS 1 A Data-Driven …wzy.ece.iastate.edu/CV/c2.pdf · IEEE Proof 2 IEEE TRANSACTIONS ON POWER SYSTEMS 83 to the grid. In [19], a non-intrusive load

IEEE P

roof

8 IEEE TRANSACTIONS ON POWER SYSTEMS

Fig. 4. One-line diagram of a real distribution system.

Fig. 5. PV generation and native demand profiles of a lateral within a one-weekperiod.

for both solar and native demand show considerably lower575

variance (i.e., higher precision), compared to those of candidate576

exemplars. This implies that using the game-theoretic learning577

process to construct composite exemplars leads to improvements578

in the disaggregation accuracy on average.579

TABLE IIPV GENERATION AND NATIVE DEMAND DISAGGREGATION MAPE

Since the proposed disaggregator depends on exemplary pro- 580

files, it is critical to conduct additional numerical analysis to 581

capture the sensitivity of disaggregation accuracy with respect 582

to the number of observable customers in SG and SP . This 583

has been done by reducing the percentage of observable PVs, 584

ng , from 1.5% to 0.1% in 0.1% steps (1 PV per step) and 585

performing disaggregation at each step. The average MAPEs 586

of disaggregated PV generation and native demand are plotted 587

against the percentage of observable PVs, as shown in Fig. 7. As 588

can be seen in Fig. 7a, once ng drops below 0.3%, the average 589

MAPE significantly increases. In the worst case, when only 590

1 customer is observable (ng ≈ 0.1%), the average MAPE is 591

about 28%. This is consistent with expectations: (1) Only a 592

single observable PV cannot represent all other PVs, since the 593

PV panel orientation has significant impact on PV generation 594

profile; (2) As ng increase from 0.1% to 0.3%, the average 595

MAPE significantly decreases, because the unobservable PVs 596

can be better represented using more diverse PV generation 597

profiles; (3) When ng is larger than 0.3%, further increase in 598

Page 9: IEEE TRANSACTIONS ON POWER SYSTEMS 1 A Data-Driven …wzy.ece.iastate.edu/CV/c2.pdf · IEEE Proof 2 IEEE TRANSACTIONS ON POWER SYSTEMS 83 to the grid. In [19], a non-intrusive load

IEEE P

roof

BU et al.: DATA-DRIVEN GAME-THEORETIC APPROACH FOR BEHIND-THE-METER PV GENERATION DISAGGREGATION 9

Fig. 6. The time-series weights assigned to the candidate PV generation andnative demand exemplars.

Fig. 7. Sensitivity analysis for the number of observable PVs.<

ng does not lead to any noticeable accuracy improvements since599

the redundant exemplary PV generation profiles do not contain600

much additional information. In Fig. 7b, a similar decreasing601

trend in average MAPE for the disaggregated native demand602

can also be observed, which is consistent with the constraint603

in Equation (1b). Also, we have performed numerical sensi-604

tivity analysis to capture the impact of number of observable605

Fig. 8. Error distribution under composite exemplar and candidate exemplars.

Fig. 9. MAPE comparison using RGVP-based SSS and DD-based SSS.

customers in SP on disaggregation accuracy. Similar to the 606

case of SG, as the number of customers with observable na- 607

tive demand increases the disaggregation accuracy improves. 608

In our case studies, the minimum required customers with 609

observable demand is 18%. Note that these numbers are case- 610

dependent, and vary for different customer behaviors in different 611

regions. 612

Page 10: IEEE TRANSACTIONS ON POWER SYSTEMS 1 A Data-Driven …wzy.ece.iastate.edu/CV/c2.pdf · IEEE Proof 2 IEEE TRANSACTIONS ON POWER SYSTEMS 83 to the grid. In [19], a non-intrusive load

IEEE P

roof

10 IEEE TRANSACTIONS ON POWER SYSTEMS

Fig. 10. Adaptability of the proposed data-driven approach.

To further demonstrate the advantage of our proposed ap-613

proach, we have conducted simulations in two scenarios where614

a certain percentage of BTM PVs (20% and 40%) have615

stopped running without the utility’s knowledge. A model-based616

method [34] has been used as a benchmark for comparison with617

our data-driven technique. In Fig. 10, the real PV generation618

curves, the disaggregated curves from the data-driven method619

and the estimated curves from PV model are plotted with dif-620

ferent PV failure percentages for comparison. As can be seen,621

regardless of the percentage of faulty PVs, the model-based622

method cannot detect PV failure and cannot adjust to PV gener-623

ation estimation. This inflexibility is due to the model’s inability624

to adapt to changes in system conditions, which are BTM and625

unknown. In contrast, our data-driven approach displays high626

adaptability to PVs’ failure conditions. Specifically, although the627

real PV generation decreases to a lower level due to PV failure,628

our data-driven disaggregator can track this change after a short629

transition phase. The small difference between the real and the630

estimated curves, both before the PVs’ failure and after the631

transition, demonstrates satisfying disaggregation accuracy. The632

adaptability of the proposed disaggregator can also be extended633

to unauthorized PV installation and expansion, since essentially634

both partial failure and installation can be translated into changes635

in BTM capacity of generation.636

An alternative approach to the proposed RGVP is to directly637

perform PV generation disaggregation using candidate exem-638

plars without developing composite exemplars. For abbrevia-639

tion, we denote this approach as “direct disaggregation-based640

SSS” (DD-based SSS). The solar-demand disaggregation accu-641

racy of the DD-based SSS and the proposed RGVP-based SSS642

are shown in Fig. 9. It can be seen that the proposed RGVP-based643

Fig. 11. Estimation MAPE distribution for PV generation and native demandat secondary transformer level.

SSS outperforms DD-based SSS in terms ofMAPE. The reason 644

for this better performance is that the RGVP-based method can 645

identify the candidate exemplars that are highly correlated with 646

the BTM real load/solar powers. 647

The proposed BTM PV generation and native demand disag- 648

gregation approach is also applied to secondary transformers. 649

Fig. 11 shows the transformer-level disaggregation MAPE 650

distribution. It can be seen that the proposed method is able to 651

achieve an average solar disaggregation MAPE of 10.0%, with 652

an average native demand disaggregation MAPE of 15.0%. As 653

can be seen, disaggregation at transformer-level results in higher 654

residuals compared to lateral-level due to increased grid-edge 655

demand volatility. 656

VII. CONCLUSION 657

This paper presents a non-intrusive novel RGVP-based ap- 658

proach to disaggregate BTM solar generation from the net 659

Page 11: IEEE TRANSACTIONS ON POWER SYSTEMS 1 A Data-Driven …wzy.ece.iastate.edu/CV/c2.pdf · IEEE Proof 2 IEEE TRANSACTIONS ON POWER SYSTEMS 83 to the grid. In [19], a non-intrusive load

IEEE P

roof

BU et al.: DATA-DRIVEN GAME-THEORETIC APPROACH FOR BEHIND-THE-METER PV GENERATION DISAGGREGATION 11

demand. The proposed method employs the data of fully observ-660

able customers to identify typical demand/generation patterns,661

and optimally combines these patterns to improve disaggrega-662

tion performance over time. We have used real smart meter663

data and practical distribution system models from our utility664

partners to show that this technique is able to enhance solar665

disaggregation accuracy by adaptively updating the estimator’s666

response to volatile BTM resources. This can enhance utilities’667

situational awareness of grid-edge resources without incurring668

extra metering investment costs. The key findings of the paper669

are summarized as follows:670� Using real smart meter data, we have observed that: the671

native demand of any two sizable groups of customers are672

highly correlated; any two PV generation profiles with sim-673

ilar orientations are significantly correlated; the correlation674

between PV generation and native demand is insignificant.675

Based on these three observations, we have proposed a676

novel data-driven PV generation disaggregator which only677

relies on utilities’ existing smart meter data to separate678

native demand and BTM solar power.679� Numerical experiments have demonstrated that our ap-680

proach can accurately perform solar generation disaggre-681

gation without knowing the specific parameters of BTM682

PV array and inverters, or weather information. This gives683

our method a considerable edge over parameter-dependent684

model-based techniques.685� The numerical experiments have also verified that the686

proposed disaggregator shows strong robustness and adapt-687

ability to unobservable BTM abnormal events, such as PV688

failure and unauthorized PV array installation/expansion.689

The proposed approach shows satisfactory performance on690

feeder/lateral-level PV generation disaggregation in which our691

utility partners have shown great interest; however, when ap-692

plied to individual customers’ data, the disaggregation accuracy693

declines. In the future, we intend to address this challenge.694

REFERENCES695

[1] Y. Wang, N. Zhang, Q. Chen, D. S. Kirschen, P. Li, and Q. Xia, “Data-696driven probabilistic net load forecasting with high penetration of behind-697the-meter PV,” IEEE Trans. Power Syst., vol. 33, no. 3, pp. 3255–3264,698May 2018.699

[2] F. Wang, K. Li, C. Liu, Z. Mi, M. Shafie-Khah, and J. P. S. Catalo, “Syn-700chronous pattern matching principle-based residential demand response701baseline estimation: Mechanism analysis and approach description,” IEEE702Trans. Smart Grid, vol. 9, no. 6, pp. 6972–6985, Nov. 2018.703

[3] R. Seguin, J. Woyak, D. Costyk, J. Hambrick, and B. Mather, High704Penetration PV Integration Handbook for Distribution Engineers. Golden,705CO, USA: Nat. Renewable Energy Lab., 2016.706

[4] B. Chen, C. Chen, J. Wang, and K. L. Butler-Purry, “Sequential service707restoration for unbalanced distribution systems and microgrids,” IEEE708Trans. Power Syst., vol. 33, no. 2, pp. 1507–1520, Mar. 2018.709

[5] X. Zhang and S. Grijalva, “A data-driven approach for detection and710estimation of residential PV installations,” IEEE Trans. Smart Grid, vol. 7,711no. 5, pp. 2477–2485, Sep. 2016.712

[6] W. Kersting, Distribution System Modeling and Analysis. Boca Raton, FL,713USA: CRC Press, 2011.714

[7] F. Ding and B. Mather, “On distributed PV hosting capacity estimation,715sensitivity study and improvement,” IEEE Trans. Sustain. Energy, vol. 8,716no. 3, pp. 1010–1020, Jul. 2017.717

[8] Q. Zhang, J. Zhang, and C. Guo, “Photovoltaic plant metering monitoring718model and its calibration and parameter assessment,” in Proc. IEEE Power719Energy Soc. General Meeting, Jul. 2012, pp. 1–7.720

[9] D. Chen and D. Irwin, “Sundance: Black-box behind-the-meter solar 721disaggregation,” in Proc. e-Energy, May 2017, pp. 16–19. 722

[10] F. Wang et al., “A distributed PV system capacity estimation approach 723based on support vector machine with customer net load curve features,” 724Energies, vol. 11, no. 7, Jul. 2018, Art. no. 1750. 725

[11] Y. Zhao, R. Ball, J. Mosesian, J.-F. de Palma, and B. Lehman, “Graph- 726based semi-supervised learning for fault detection and classification in 727solar photovoltaic arrays,” IEEE Trans. Power Electron., vol. 30, no. 5, 728pp. 2848–2858, May 2015. 729

[12] W. Yi, C. Qixin, H. Tao, and K. Chongqing, “Review of smart meter 730data analytics: Applications, methodologies, and challenges,” IEEE Trans. 731Smart Grid, vol. 10, no. 3, pp. 3125–3148, Mar. 2018. 732

[13] K. Dehghanpour, Z. Wang, J. Wang, Y. Yuan, and F. Bu, “A survey on 733state estimation techniques and challenges in smart distribution systems,” 734IEEE Trans. Smart Grid, vol. 10, no. 2, pp. 2312–2322, Sep. 2018. 735

[14] M. Khodayar, J. Wang, and Z. Wang, “Energy disaggregation via deep 736temporal dictionary learning,” IEEE Trans. Neural Netw. Learn. Syst, to 737be published. Q2738

[15] W. Liu, W. Gu, J. Wang, W. Yu, and X. Xi, “Game theoretic non- 739cooperative distributed coordination control for multi-microgrids,” IEEE 740Trans. Smart Grid, vol. 9, no. 6, pp. 6986–6997, Nov. 2018. 741

[16] H. Shaker, H. Zareipour, E. Muljadi, and D. Wood, “A data-driven ap- 742proach for estimating the power generation of invisible solar sites,” IEEE 743Trans. Smart Grid, vol. 7, no. 5, pp. 2466–2476, Sep. 2016. 744

[17] E. C. Kara, C. M. Roberts, M. D. Tabone, L. Alvarez, D. S. Callaway, 745and E. M. Stewart, “Disaggregating solar generation from feeder-level 746measurements,” Sustain. Energy, Grids Netw., vol. 13, 2018. Q3747

[18] F. Sossan, L. Nespoli, V. Medici, and M. Paolone, “Unsupervised disag- 748gregation of photovoltaic production from composite power flow measure- 749ments of heterogeneous prosumers,” IEEE Trans. Ind. Informat., vol. 14, 750no. 9, pp. 3904–3913, Sep. 2018. 751

[19] C. Dinesh, S. Welikala, Y. Liyanage, M. P. B. Ekanayake, R. I. 752Godaliyadda, and J. Ekanayake, “Non-intrusive load monitoring under 753residential solar power influx,” Appl. Energy, vol. 205, pp. 1068–1080, 754Aug. 2017. 755

[20] G. S. Ledva, L. Balzano, and J. L. Mathieu, “Real-time energy disaggrega- 756tion of a distribution feeder’s demand using online learning,” IEEE Trans. 757Power Syst., vol. 33, no. 5, pp. 4730–4740, Sep. 2018. 758

[21] W. Kong, Z. Y. Dong, J. Ma, D. Hill, J. Zhao, and F. Luo, “An extensible 759approach for non-intrusive load disaggregation with smart meter data,” 760IEEE Trans. Smart Grid, vol. 9, no. 4, pp. 3362–3372, Jul. 2018. 761

[22] M. Maschler, E. Solan, and S. Zamir, Game Theory. Cambridge, U.K.: 762Cambridge Univ. Press, 2013. 763

[23] L. Han, T. Morstyn, and M. McCulloch, “Incentivizing prosumer coalitions 764with energy management using cooperative game theory,” IEEE Trans. 765Power Syst., vol. 34, no. 1, pp. 303–313, Jan. 2019. 766

[24] T. Dai and W. Qiao, “Finding equilibria in the pool-based electricity market 767with strategic wind power producers and network constraints,” IEEE Trans. 768Power Syst., vol. 32, no. 1, pp. 389–399, Dec. 2016. 769

[25] T. A. Short, Electric Power Distribution Handbook. Boca Raton, FL, USA: 770CRC Press, 2014. 771

[26] F. Bu, Y. Yuan, Z. Wang, K. Dehghanpour, and A. Kimber, “A time- 772series distribution test system based on real utility data,” 2019. [Online]. 773Available: https://arxiv.org/abs/1906.04078 774

[27] U. von Luxburg, “A tutorial on spectral clustering,” Statist Comput., 775vol. 17, pp. 395–416, 2007. 776

[28] F. R. K. Chung, Spectral Graph Theory. Providence, RI, USA: American 777Mathematical Society, 1997. 778

[29] M. Halkidi, Y. Batistakis, and M. Vazirgiannis, “On clustering validation 779techniques,” J. Intell. Inf. Syst., vol. 17, pp. 107–145, 2001. 780

[30] M. Wytock and J. Z. Kolter, “Contextually supervised source separation 781with application to energy disaggregation,” in Proc. AAAI Conf. Artif. 782Intell., Jul. 2014, pp. 486–492. 783

[31] S. Salvador and P. Chan, “Determining the number of clusters/segments in 784hierarchical clustering/segmentation algorithms,” in Proc. 16th IEEE Int. 785Conf. Tools AI, 2004, pp. 576–584. 786

[32] L. Queiroz, M. A. Roseli, C. Cavellucci, and C. Lyra, “Energy losses esti- 787mation in power distribution systems,” IEEE Trans. Power Syst., vol. 27, 788no. 4, pp. 1879–1887, Apr. 2012. 789

[33] R. S. Rao, K. Ravindra, K. Satish, and S. V. L. Narasimham, “Power loss 790minimization in distribution system using network reconfiguration in the 791presence of distributed generation,” IEEE Trans. Power Syst., vol. 28, no. 1, 792pp. 317–325, Feb. 2013. 793

[34] D. L. King, W. E. Boyson, and J. A. Kratochvil, Photovoltaic Array 794Performance Model. Albuquerque, NM: Sandia National Labs., 2004. 795

Page 12: IEEE TRANSACTIONS ON POWER SYSTEMS 1 A Data-Driven …wzy.ece.iastate.edu/CV/c2.pdf · IEEE Proof 2 IEEE TRANSACTIONS ON POWER SYSTEMS 83 to the grid. In [19], a non-intrusive load

IEEE P

roof

12 IEEE TRANSACTIONS ON POWER SYSTEMS

Fankun Bu (Student Member, IEEE) received the796B.S. and M.S. degrees from North China Electric797Power University, Baoding, China, in 2008 and 2013,798respectively. He is currently working toward the Ph.D.799degree with the Department of Electrical and Com-800puter Engineering, Iowa State University, Ames, IA,801USA. From 2008 to 2010, he was a Commissioning802Engineer with NARI Technology Co., Ltd., Nanjing,803China. From 2013 to 2017, he was an Electrical804Engineer with State Grid Corporation of China at805Jiangsu, Nanjing, China. His research interests in-806

clude distribution system modeling, smart meter data analytics, renewable807energy integration, and power system relaying.808

809

Kaveh Dehghanpour (Member, IEEE) received the810B.Sc. and M.S. degrees in electrical and computer811engineering from the University of Tehran, Tehran,812Iran, in 2011 and 2013, respectively, and the Ph.D.813degree in electrical engineering from Montana State814University, Bozeman, MT, USA, in 2017. He is cur-815rently a Postdoctoral Research Associate with Iowa816State University, Ames, IA, USA. His research in-817terests include application of machine learning and818data-driven techniques in power system monitoring819and control.820

821

Yuxuan Yuan (Student Member, IEEE) received the822B.S. degree in electrical and computer engineering in8232017 from Iowa State University, Ames, IA, USA,824where he is currently working toward the Ph.D. de-825gree. His research interests include distribution sys-826tem state estimation, synthetic networks, data analyt-827ics, and machine learning.828

829

Zhaoyu Wang (Member, IEEE) received the B.S. 830and M.S. degrees in electrical engineering from 831Shanghai Jiaotong University, Shanghai, China, in 8322009 and 2012, respectively, and the M.S. and Ph.D. 833degrees in electrical and computer engineering from 834the Georgia Institute of Technology, Atlanta, GA, 835USA, in 2012 and 2015, respectively. He is currently 836the Harpole-Pentair Assistant Professor with Iowa 837State University, Ames, IA, USA. He was a Research 838Aid with Argonne National Laboratory in 2013 and 839an Electrical Engineer Intern with Corning, Inc., in 840

2014. His research interests include power distribution systems and microgrids, 841particularly on their data analytics and optimization. He is the Principal Inves- 842tigator for a multitude of projects focused on these topics and funded by the 843National Science Foundation, the Department of Energy, National Laboratories, 844PSERC, and Iowa Energy Center. He is the Secretary of IEEE Power and Energy 845Society (PES) Award Subcommittee, a Co-Vice Chair of PES Distribution 846System Operation and Planning Subcommittee, and the Vice Chair of PES Task 847Force on Advances in Natural Disaster Mitigation Methods. He is an Editor 848for the IEEE TRANSACTIONS ON POWER SYSTEMS, IEEE TRANSACTIONS ON 849SMART GRID, IEEE PES LETTERS, and IEEE OPEN ACCESS JOURNAL OF POWER 850AND ENERGY, and an Associate Editor of IET Smart Grid. 851

852

Yingchen Zhang (Senior Member, IEEE) received 853the B.S. degree from Tianjin University, Tianjin, 854China, in 2003, and the Ph.D. degree from Virginia 855Polytechnic Institute and State University, Blacks- 856burg, VA, USA, in 2010. He is a Chief Scientist and 857a Manager of the Sensing and Predictive Analytics 858Group, National Renewable Energy Laboratory. He 859is also a Visiting Research Assistant Professor with 860the University of Denver and an Adjunct Faculty with 861Colorado State University. He has authored or coau- 862thored more than 100 peer-reviewed publications and 863

holds one U.S. patent. His key areas of expertise lie in sensing and measurement 864for power systems, artificial intelligence, predictive analytics, advanced energy 865management system for future grids, the impact of large-scale integration of 866renewable energies on power system operations. 867

He previously worked for the California ISO developing and implementing 868State Estimator and Network Applications. He serves as an Editor for the IEEE 869TRANSACTIONS ON SUSTAINABLE ENERGY and IEEE POWER ENGINEERING 870LETTERS. 871

872