

Clustering of Electricity Distribution Systems for Performance Analysis

Yang Zhang

Supervisor: Prof. Ettore Bompard

List of attended classes

• 01LGSRV Characterization and planning of small-scale multigeneration (2017-09)

• 01QSNRV Energy security in EU: Methodological approaches and policy (2017-06)

• 01RISRV Public speaking (2017-09)

• 02LWHRV Communication (2017-09)

• 01RQXRV Pattern recognition and neural networks (2016-11)

• 02PKLRQ Ottimizzazione in condizioni di incertezza: modellazione e metodi di (2017-04)

Novel contributions

• Data cleaning

• Suppose the data set X contains n feeders (X1, X2, …, Xn), each described by p features. The Minkowski metric is a common measure of the dissimilarity between objects.

• To bring the selected features into a comparable range, the numerical features should be normalized.

• For categorical features, the dissimilarity between two objects can be defined by the mismatches of the corresponding features, as in Gower's distance.
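The mixed-type dissimilarity described above can be sketched as follows. This is a minimal illustration, not the poster's implementation: it assumes range normalization of the numerical features, equal weights for all features, and a hypothetical function name `gower_distance`; the three sample feeders are made up.

```python
import numpy as np

def gower_distance(num, cat):
    """Pairwise Gower-style dissimilarity for mixed data (equal weights assumed).

    num: (n, p_num) array of numerical features.
    cat: (n, p_cat) array of categorical labels.
    """
    num = np.asarray(num, dtype=float)
    cat = np.asarray(cat)
    # Range-normalize each numerical feature so it contributes a value in [0, 1].
    span = num.max(axis=0) - num.min(axis=0)
    span[span == 0] = 1.0  # a constant column contributes zero everywhere
    num_part = np.abs(num[:, None, :] - num[None, :, :]) / span
    # A categorical feature contributes 0 on a match and 1 on a mismatch.
    cat_part = (cat[:, None, :] != cat[None, :, :]).astype(float)
    total = num_part.sum(axis=2) + cat_part.sum(axis=2)
    return total / (num.shape[1] + cat.shape[1])

# Hypothetical feeders: (length, customers) and (neutral type, automation type).
num = [[10.0, 100.0], [20.0, 300.0], [30.0, 200.0]]
cat = [["Isolated", "FNC"], ["Isolated", "FRG"], ["Fixed Coils", "FRG"]]
D = gower_distance(num, cat)
```

The result is a symmetric n x n matrix with a zero diagonal, which can be passed directly to any clustering method that accepts precomputed dissimilarities.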

Addressed research questions/problems

• According to data from the Enel distribution company in Italy, more than 20,000 MV feeders (15 kV and 20 kV) are spread across the country (except the Trentino-Alto Adige and Valle d'Aosta regions), as shown in Fig. 2. Because circumstances differ from region to region, these feeders have diverse properties.

• Enel's 2014 failure report for MV feeders reveals that the average interruption frequency per feeder varies widely between regions, as shown in Fig. 3. For example, Lombardia has the largest number of feeders and the lowest interruption frequency, while in Sicilia the number of interruptions is 10 times the number of feeders.

• Data mining is an efficient analytic technique for sorting mixed data in a rational way and extracting representative information from complicated data sources. In our case, we use clustering algorithms to find a taxonomy of typical feeders, so that the relationship between feeder performance and the taxonomy can be revealed.

• Numerical structural features for each feeder

• Non-numerical structural features for each feeder

Research context and motivation

• As the terminal part of the power grid, distribution feeders connect directly to users and play an important role in the quality of power supply.

• Because customer profiles and infrastructure planning purposes differ, distribution lines vary widely in their structural features, including the number of customers, capacity, and neutral grounding mode.

• Distribution networks now face critical challenges from the large penetration of renewable energy sources and the increasing use of electric vehicles, as shown in Fig. 1. Rather than the transmission system, these new technologies tend to use the low-voltage, widely spread distribution system as their interface to the energy network, which adds complexity and uncertainty to the network.

• However, the number of sensors in distribution networks is limited, and the sampling interval of smart meters is usually too long for algorithms to provide a sound real-time analysis. A detailed analysis of every individual feeder would also take distribution system operators (DSOs) too much time.

Adopted methodologies

• Clustering algorithms

(1) The PAM (Partitioning Around Medoids) algorithm searches for k representative objects, called medoids, in the data set. Each cluster consists of one medoid and the data points nearest to it. The best k medoids minimize the sum of the dissimilarities between each observation and its closest medoid.

(2) Hierarchical clustering is an alternative approach that builds a hierarchy of clusters; the clustering results are presented in a dendrogram.

• Best clustering number

The average silhouette coefficient and the Calinski-Harabasz index are effective indicators for evaluating a clustering result. Both can be regarded as a ratio of the separation between groups to the compactness of the objects within each group.
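The PAM objective and the silhouette-based choice of k described above can be sketched on a precomputed dissimilarity matrix. This is a simplified illustration under stated assumptions: the greedy initialization and exhaustive swap search are a basic variant of PAM, the function names `pam` and `average_silhouette` are hypothetical, and the one-dimensional toy data is made up. It assumes k is at least 2.

```python
import numpy as np

def pam(D, k, max_iter=100):
    """Basic k-medoids on an (n, n) dissimilarity matrix D."""
    n = D.shape[0]
    # Greedy start: the most central object, then the objects that cut the cost most.
    medoids = [int(np.argmin(D.sum(axis=1)))]
    while len(medoids) < k:
        nearest = D[:, medoids].min(axis=1)
        gains = np.maximum(nearest[:, None] - D, 0).sum(axis=0)
        gains[medoids] = -1.0  # never pick an existing medoid again
        medoids.append(int(np.argmax(gains)))
    cost = D[:, medoids].min(axis=1).sum()
    # Swap phase: apply the best medoid/non-medoid swap until nothing improves.
    for _ in range(max_iter):
        best = (cost, None, None)
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids.copy()
                trial[pos] = cand
                trial_cost = D[:, trial].min(axis=1).sum()
                if trial_cost < best[0] - 1e-12:
                    best = (trial_cost, pos, cand)
        if best[1] is None:
            break
        cost = best[0]
        medoids[best[1]] = best[2]
    labels = D[:, medoids].argmin(axis=1)
    return np.array(medoids), labels, float(cost)

def average_silhouette(D, labels):
    """Mean silhouette coefficient computed from a dissimilarity matrix."""
    clusters = np.unique(labels)
    s = np.zeros(D.shape[0])
    for i in range(D.shape[0]):
        own = labels == labels[i]
        if own.sum() == 1:
            continue  # the silhouette of a singleton is defined as 0
        a = D[i, own].sum() / (own.sum() - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        s[i] = 0.0 if max(a, b) == 0 else (b - a) / max(a, b)
    return float(s.mean())

# Toy data: two well-separated groups on a line.
x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
D = np.abs(x[:, None] - x[None, :])
medoids, labels, cost = pam(D, 2)
scores = {k: average_silhouette(D, pam(D, k)[1]) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
```

Working on a dissimilarity matrix rather than raw coordinates is a deliberate choice here: it lets the same code run on mixed numerical and categorical features once a suitable distance has been computed.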

Future work

• Predictive maintenance

PhD program in Electrical, Electronics and Communications Engineering, XXXII Cycle

Fig. 1 Distribution network with renewable energy sources and electric vehicles

Fig. 2 Number of MV feeders (Enel) in Different Regions

Fig. 3 Interruption Percentage of MV feeders (Enel) in Different Regions

Feature         Description
Length          Total length of the feeder
Cable%          Percentage of underground cable in a feeder
Nodes           Number of nodes in a feeder
Branches        Number of branches in a feeder
Customers       Number of customers in a feeder
Sec Sub         Number of secondary substations in a feeder
Auto Nodes      Number of nodes with automation equipment
MV/LV Trans     Number of MV/LV transformers in a feeder
Capacity        Apparent power in a feeder
Neutral Types   Neutral grounding mode of a feeder
Auto Types      Automation types of a feeder

Neutral Grounding Mode: Isolated, Resistance, Fixed Coils, Adjustable Coils, Fixed+Adjustable Coils

Automation Types: FNC, FRG, FNC+ICS, FRG+ICS, ICS, None

\[ d(X_i, X_j) = \sum_{k=1}^{p} \delta(x_{ik}, x_{jk}) \]

\[ \delta(x_i, x_j) = \begin{cases} 0, & x_i = x_j \\ 1, & x_i \neq x_j \end{cases} \]

In statistics, the linear relationship between two variables can be evaluated by the Pearson correlation coefficient. ICS automation and the resistance grounding mode are both removed from the data because they occur rarely, as shown in the bar plot.

• Since clustering is sensitive to skewed data distributions, a square root transformation is applied to the highly skewed numerical features.
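The two preprocessing checks mentioned above, the Pearson correlation and the square root transformation of skewed features, can be illustrated with a short sketch. The helper names `skewness` and `pearson` are hypothetical and the feeder values are made up for illustration; they are not taken from the Enel data.

```python
import numpy as np

def skewness(x):
    """Sample skewness: the mean of the cubed standardized values."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float((z ** 3).mean())

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D samples."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    yc = np.asarray(y, dtype=float) - np.mean(y)
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

# Hypothetical right-skewed feature: one feeder is far longer than the rest.
length = np.array([1.0, 4.0, 4.0, 9.0, 9.0, 16.0, 100.0])
length_sqrt = np.sqrt(length)  # the square root compresses the long right tail

# Hypothetical second feature that grows linearly with length.
nodes = 3.0 * length + 2.0

skew_before = skewness(length)
skew_after = skewness(length_sqrt)
r = pearson(length, nodes)
```

A strongly correlated pair such as `length` and `nodes` carries largely redundant information, which is one reason to inspect the correlation matrix before selecting clustering features.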

\[ d_m(X_i, X_j) = \left( \sum_{k=1}^{p} \left| x_{ik} - x_{jk} \right|^{m} \right)^{1/m} = \lVert x_i - x_j \rVert_m \]
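As a small worked instance of the Minkowski metric, the sketch below evaluates it for m = 1 (Manhattan) and m = 2 (Euclidean) and applies min-max normalization first. Min-max scaling is an assumption, since the poster does not state which normalization was used; the function names and feeder values are hypothetical.

```python
import numpy as np

def minkowski(xi, xj, m=2):
    """Minkowski distance of order m between two feature vectors."""
    xi = np.asarray(xi, dtype=float)
    xj = np.asarray(xj, dtype=float)
    return float((np.abs(xi - xj) ** m).sum() ** (1.0 / m))

def min_max(X):
    """Scale every column of X to [0, 1] (assumed normalization scheme)."""
    X = np.asarray(X, dtype=float)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # leave constant columns at zero
    return (X - X.min(axis=0)) / span

d1 = minkowski([0, 0], [3, 4], m=1)  # |3| + |4| = 7
d2 = minkowski([0, 0], [3, 4], m=2)  # sqrt(9 + 16) = 5

# Hypothetical feeders: (length, customers). Without scaling, the customer
# count would dominate the distance purely because of its larger range.
X = np.array([[10.0, 100.0], [20.0, 3000.0], [30.0, 1500.0]])
Xn = min_max(X)
d_scaled = minkowski(Xn[0], Xn[1], m=2)
```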

Fig. 7 PAM results

Fig. 8 Hierarchical results

Fig. 5 Scatter plot of numerical features

Fig. 6 Pearson correlation coefficients

Fig. 9 Average silhouette coefficient

Fig. 10 Calinski-Harabasz index
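For reference, the Calinski-Harabasz index named in Fig. 10 can be computed as the between-cluster dispersion per degree of freedom divided by the within-cluster dispersion per degree of freedom. The sketch below uses the standard centroid-based definition, which assumes numerical features in Euclidean space; how the poster handled the categorical features for this index is not stated. The function name and toy data are hypothetical.

```python
import numpy as np

def calinski_harabasz(X, labels):
    """Calinski-Harabasz index: (B / (k - 1)) / (W / (n - k))."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = X.shape[0]
    clusters = np.unique(labels)
    k = len(clusters)
    overall = X.mean(axis=0)
    between = 0.0  # B: size-weighted squared distance of centroids to the overall mean
    within = 0.0   # W: squared distance of points to their own centroid
    for c in clusters:
        members = X[labels == c]
        centroid = members.mean(axis=0)
        between += len(members) * ((centroid - overall) ** 2).sum()
        within += ((members - centroid) ** 2).sum()
    return float((between / (k - 1)) / (within / (n - k)))

# Toy data: two tight groups on a line, scored with a good and a poor partition.
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
ch_good = calinski_harabasz(X, [0, 0, 0, 1, 1, 1])
ch_poor = calinski_harabasz(X, [0, 1, 0, 1, 0, 1])
```

A higher value indicates clusters that are better separated relative to their internal spread, so the index is typically compared across candidate values of k.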

Unlike time-based preventive maintenance, predictive maintenance (or condition-based maintenance) is planned when the need arises. The concept of this strategy is to use data-driven techniques to identify the parameters that indicate a potential failure of a piece of equipment. The maintenance plan can then be optimized to reduce cost and outages.

Fig. 4 Neutral grounding types for each Automation Type