
Power Quality Data Evaluation in Distribution Networks Based on Data Mining Techniques

Tongyou Gu*, P. Kadurek†, J. F. G. Cobben†, A. W. Endhoven*

*Alliander N.V., Netherlands, [email protected]

†Eindhoven University of Technology, Netherlands

Abstract-With the increasing amount of data available from the harmonic monitoring systems in the distribution grid, it is becoming more important to evaluate the harmonic data.

This paper presents an algorithm using a data mining technique, in particular mixture modeling based on the Minimum Message Length (MML) method, to classify the harmonic data into clusters and identify useful patterns within the data. The resulting clusters are applied to distinguish the sources of disturbances and the time schedule of the disturbances in the distribution grid. In addition, the C5.0 algorithm, a supervised learning method, is used to produce rules about how the measured data is classified into various clusters using the decision tree technique. These generated rules can then be utilized to predict which cluster any new data belongs to without calculating again.

Index Terms-clustering methods, data mining, power distribution, power system harmonics, power quality

I. INTRODUCTION

Nowadays, customers are using more and more devices based on power electronics, which are more vulnerable to power quality (PQ) distortion, in particular to harmonic distortion in the voltage supply waveform. The distribution system operator (DSO) has to provide a certain quality of voltage supply to customers and decrease losses in the electricity grid [1]. In order to meet the requirements of the DSO and customers, the number of measurement points across the distribution grid is increasing, which provides numerous PQ measurements [2]. These large amounts of PQ data hold more information than is reported using classical techniques for PQ monitoring. Data mining methods extract hidden information that might be critical for the identification and diagnosis of power quality disturbances and the prediction of system abnormalities or failures [3]. Therefore, advanced evaluation of large volumes of PQ data is an important task.

Data mining tools are an obvious candidate for analyzing large-scale data. Data mining can be understood as a process that uses a variety of analytical tools to identify hidden patterns within data. One important tool is clustering, which is a form of classification often used to find patterns and anomalies in multivariate data. Once the data is classified, another tool, the decision tree technique, can be used to build rules or relationships between the input data and the classes [4].

The aim of this paper is to present a methodical approach using data mining techniques to classify harmonic data into different clusters, which can assist in analyzing large volumes of harmonic measurements and extracting hidden information. This paper addresses how:

• To develop an algorithm for classifying large amounts of harmonic data into clusters using the Minimum Message Length (MML) technique (Section II).

• To apply the MML algorithm to measured data from computer simulation of harmonic events in a distribution grid (Section III).

• To determine and distinguish disturbance sources in the distribution grid with clusters obtained from the MML algorithm (Section IV).

• To further interpret the obtained clusters and to generate rules about the occurrence of cases using the decision tree technique (Section IV).

• To propose the requirements of measurements, in order to improve the accuracy of the result (Section IV).

II. DATA MINING

A. Data Mining in Power Quality Data Analysis

Data mining is popular in many research fields [5]. It is also a preferred candidate for assisting in the analysis of power systems. Essentially, applying data mining tools to power quality data provides the ability to identify the various underlying contexts associated with the monitored sites and the power quality disturbances of interest.

There are two important learning strategies in data mining: Supervised Learning (SL) and Unsupervised Learning (USL). Unsupervised learning is applied to discover a number of pattern labels, subsets, or segments within the data, without any prior knowledge of the target classes. In supervised learning, each instance of data is mapped to its associated label in order to find the interpretation of these pattern labels [6].

B. Clustering Technique as Unsupervised Learning

Clustering is a process that segments or divides an initially

unlabeled collection of data with various attributes into a

certain number of groups or clusters [7]. As a result, the

data grouped in each cluster are similar, whereas data across

different clusters are dissimilar. Clustering can be considered

as a learning process, and a method for analyzing large

volumes of data which is hard to analyze as a whole.

978-1-4673-3059-6/13/$31.00 ©2013 IEEE

There are a variety of clustering algorithms. The Mixture Modeling method is chosen in this paper because it can process a large amount of data and has better accuracy [7].

1) Mixture Modeling Method [8]: Mixture Modeling can be described as an unsupervised learning method which constructs a model based on a mixture of statistical distributions that have been learned from the data. The Gaussian distribution is the most commonly used distribution in mixture models.

The two main steps used in the Mixture Modeling method

are parameter estimation and model selection. In order to

identify a suitable model, the distribution parameters that

could have plausibly generated the same data values should

be estimated first.

For a single distribution, the mean and variance that make the data most likely to have arisen from this distribution can be calculated by differentiating the likelihood of all data points with respect to the parameters and equating the result to zero. However, in the case of mixture distributions it is hard to calculate the probability of each data point, as it is not known which distribution has generated which data point. The expectation-maximization (EM) algorithm can solve this problem [9].

First, initial values of the distribution parameters are selected. In the expectation step, the probability that each point belongs to each distribution is calculated from Bayes' rule. Second, in the maximization step, the calculated probabilities are used to re-estimate the distribution parameters. The two steps are repeated until the parameters converge.
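The two EM steps described above can be sketched for a one-dimensional Gaussian mixture as follows. This is a minimal illustration and not the Autoclass implementation; the function names, the fixed iteration count, the variance floor and the sample values are all assumptions made for the example.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gaussian_mixture(data, means, variances, weights, iterations=50):
    """Fit a 1-D Gaussian mixture by EM, starting from the given initial parameters."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(iterations):
        # E-step: Bayes' rule gives the probability that each point
        # was generated by each component.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            if total == 0.0:
                # Numerical underflow: fall back to equal membership.
                resp.append([1.0 / k] * k)
            else:
                resp.append([p / total for p in joint])
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0.0:
                continue  # empty component: keep its previous parameters
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, 1e-6)  # floor avoids a collapsed component
            weights[j] = nj / len(data)
    return means, variances, weights


# Hypothetical measurement values: one low-current state and one high-current state.
samples = [0.8, 1.0, 1.2, 0.9, 1.1, 10.0, 10.5, 9.5, 11.0, 9.0]
fitted_means, fitted_vars, fitted_weights = em_gaussian_mixture(
    samples, means=[2.0, 8.0], variances=[4.0, 4.0], weights=[0.5, 0.5]
)
```

With well-separated groups such as these, the fitted means settle close to the two group averages; real harmonic data would need a convergence check instead of a fixed iteration count.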

2) Minimum Message Length (MML) Technique in Mixture Modeling: The MML technique is an important application of the Mixture Modeling method for data segmentation. As the name implies, MML evaluates models according to their ability to compress a message containing the data.

Compression methods generally attain high densities by formulating efficient models of the data to be encoded. The encoded message consists of two parts: the first part describes the model and the second describes the observed data given that model. The total encoded message length is then calculated, and the model with the shortest total message length is selected as the best.
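The two-part message idea can be illustrated with a simplified calculation. The sketch below is not the exact MML formula and not what Autoclass computes: it assumes that each free model parameter costs 0.5·log2(n) bits (a common large-sample approximation) and that a value recorded to within a given accuracy has probability of about density × accuracy. All names and numbers are hypothetical.

```python
import math


def mixture_density(x, components):
    """Density of a 1-D Gaussian mixture; components are (weight, mean, variance)."""
    return sum(
        w * math.exp(-((x - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
        for w, m, v in components
    )


def message_length_bits(data, components, accuracy):
    """Two-part message length in bits under the simplifying assumptions above."""
    n = len(data)
    # Part 1: cost of stating the model (assumed 0.5 * log2(n) bits per
    # free parameter: a mean and a variance per component, plus the weights).
    free_parameters = 3 * len(components) - 1
    model_bits = 0.5 * free_parameters * math.log2(n)
    # Part 2: cost of stating the data given the model.
    data_bits = -sum(math.log2(mixture_density(x, components) * accuracy) for x in data)
    return model_bits + data_bits


# Hypothetical measurement values and two candidate models for them.
samples = [0.8, 1.0, 1.2, 0.9, 1.1, 10.0, 10.5, 9.5, 11.0, 9.0]
one_state = [(1.0, 5.5, 20.51)]
two_states = [(0.5, 1.0, 0.02), (0.5, 10.0, 0.5)]

one_bits = message_length_bits(samples, one_state, accuracy=0.001)
two_bits = message_length_bits(samples, two_states, accuracy=0.001)
best = "two_states" if two_bits < one_bits else "one_state"
```

The two-component model costs more bits to describe, but it encodes the data so much more compactly that its total message is shorter, which is the trade-off the selection step relies on.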

Autoclass, which is a data mining software tool developed

by National Aeronautics and Space Administration (NASA),

has been chosen as the Mixture Modeling software used in

this paper.

C. Decision Tree as Supervised Learning

Once clusters are generated, supervised learning is usually used to map each instance of data to its associated class. The decision tree is one example of a supervised learning technique. With the decision tree, a model is generally induced or built automatically, based on some guiding information and a theoretical metric such as entropy gain [10]. It proposes plausible relationships between the input data (training set) and the classes, here being the cluster labels obtained from the unsupervised learning of MML. Once the model is trained, it can then be applied to other data sets (future data). In doing so, it is possible to predict which class a new data point best belongs to. In this paper, the C5.0 algorithm [11] is used to carry out the supervised learning.

[Fig. 1. Schematic of a simplified typical MV distribution network in the Netherlands]
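The entropy-gain metric mentioned above can be sketched for a single numeric attribute. This is a plain information-gain split search, written only to illustrate the idea; C5.0 itself uses additional refinements that are not reproduced here, and the function names and sample values are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting at `value <= threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - remainder


def best_threshold(values, labels):
    """Midpoint between adjacent distinct values that maximizes the gain."""
    distinct = sorted(set(values))
    midpoints = [(a + b) / 2.0 for a, b in zip(distinct, distinct[1:])]
    return max(midpoints, key=lambda t: information_gain(values, labels, t))


# Hypothetical 5th harmonic currents with cluster labels from the clustering step.
i5_values = [0.1, 0.2, 0.3, 9.5, 10.0, 10.5]
cluster_labels = ["S0", "S0", "S0", "S1", "S1", "S1"]

split = best_threshold(i5_values, cluster_labels)
gain = information_gain(i5_values, cluster_labels, split)
```

A tree learner applies this search recursively to every attribute and keeps the split with the highest gain at each node.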

III. SIMULATION OF HARMONIC EVENT

A. Simulation Design

In this section, the harmonic measurement data from a distribution grid, as illustrated in Fig. 1, is classified using the Mixture Modeling program Autoclass. The distribution grid is simulated in Simulink with a simulation time of 20s, and the waveform is sampled with 128 points per cycle.

As can be seen from Fig. 1, 12 monitors are installed:

one monitor at the substation incoming supply (Site 0); three

monitors at the beginning of the three individual feeders

(Site 1, Site 2, Site 3); and eight monitors at each Point of

Connection (POC) in the load area (Site 4-Site 11).

At POC Load5, a distortion source is simulated, which generates 5th and 7th harmonic currents in the time intervals t=5s-10s and t=15s-20s. Near Load7, a capacitor is placed to compensate reactive power; it is switched on at t=6s and off at t=9s.

B. Data Selection

Throughout the simulation, 1000 cycles (20s) are recorded at a frequency of 50Hz. After decomposition of each cycle of the measured waveforms in phase A by FFT, each order of harmonic currents can be obtained, as well as the fundamental currents.
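The per-cycle decomposition can be sketched as below. Instead of a full FFT, the example evaluates only the discrete Fourier transform bins of interest, which gives the same amplitudes for those orders. The function names and the synthetic test waveform are assumptions made for illustration.

```python
import cmath
import math


def harmonic_amplitude(cycle, order):
    """Peak amplitude of one harmonic order in exactly one cycle of samples.

    Valid for 0 < order < len(cycle) / 2.
    """
    n = len(cycle)
    coeff = sum(x * cmath.exp(-2j * math.pi * order * i / n) for i, x in enumerate(cycle))
    return 2.0 * abs(coeff) / n


def cycle_features(waveform, samples_per_cycle=128, orders=(1, 5, 7)):
    """Split a waveform into whole cycles and return the amplitudes per cycle."""
    features = []
    for start in range(0, len(waveform) - samples_per_cycle + 1, samples_per_cycle):
        cycle = waveform[start:start + samples_per_cycle]
        features.append([harmonic_amplitude(cycle, h) for h in orders])
    return features


# Synthetic two-cycle current: 100 A fundamental, 10 A 5th and 5 A 7th harmonic.
wave = [
    100.0 * math.sin(2 * math.pi * i / 128)
    + 10.0 * math.sin(2 * math.pi * 5 * i / 128)
    + 5.0 * math.sin(2 * math.pi * 7 * i / 128)
    for i in range(256)
]
features = cycle_features(wave)
```

Each row of `features` holds the fundamental, 5th and 7th harmonic amplitudes of one cycle, which corresponds to the three attributes used per site.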

Three sorts of attributes are selected as input of Autoclass: fundamental currents, 5th harmonic currents and 7th harmonic currents. The 3rd harmonic is not chosen because most of the 3rd harmonic current is blocked by the Δ/Y transformer, and higher order harmonics are not chosen because of their low values. In sum, 36 attributes consisting of the fundamental currents, as well as the harmonic currents of order 5 and 7, at 12 sites (Site 0-Site 11) have been selected.

Besides, some additional data, such as the reactive power,

can be used to confirm any suspected events identified from

the clustering results. This will be explained in next section.

IV. HARMONIC EVENT DETECTION

The program Autoclass is applied to the selected 36 attributes with an accuracy of measurement (Aom), which in this example is almost 0% and is set to 0.1% according to [12]. By observing how the measured data are classified into various clusters, power utility engineers can more readily deduce power quality events that may have triggered a change from one cluster to another. To confirm the observation, additional information, such as reactive power, can be used.

A. Resulting Clusters

Seven clusters have been obtained and sorted in descending order of abundance value, as shown in Table I. The first three clusters (S0, S1 and S2) cover most of the cases; thus these three clusters represent the main steady states of the power grid. The remaining clusters (S3, S4, S5 and S6) have much lower abundance values, which means they represent the transient moments. Each case in every generated cluster can be considered as a profile of 36 variables (fundamental current, 5th and 7th harmonic current at 12 sites).

Fig. 3 illustrates the 5th and 7th harmonic currents at all sites in the first three clusters. It can be observed that the harmonic currents in S0 are almost zero, which means this cluster is generated by cases that have no harmonic distortion. Fig. 2 gives the distribution of the three main clusters along the time axis, indicating which time periods the cases in each cluster come from. Thus the clean period represented by S0 is t=0s-5s and t=10s-15s.
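Mapping cluster labels back to the time axis amounts to grouping consecutive cycles that share a label. A minimal sketch follows, assuming one label per 0.02 s cycle; the label sequence is synthetic and idealized from the time periods stated in the text (transient clusters are ignored), and the function name is an assumption.

```python
from itertools import groupby


def label_intervals(labels, cycle_seconds=0.02):
    """Collapse a per-cycle label sequence into (label, start_s, end_s) runs."""
    intervals = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        intervals.append((label, start * cycle_seconds, (start + length) * cycle_seconds))
        start += length
    return intervals


# Idealized labels for 1000 cycles (20 s), one label per cycle.
cycle_labels = (
    ["S0"] * 250 + ["S1"] * 50 + ["S2"] * 150 + ["S1"] * 50 + ["S0"] * 250 + ["S1"] * 250
)
intervals = label_intervals(cycle_labels)
clean_periods = [(t0, t1) for label, t0, t1 in intervals if label == "S0"]
```

Filtering the runs by label, as done for `clean_periods`, recovers the periods in which the grid is in a given state.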

Meanwhile, large harmonic currents appear in S1 and S2. In cluster S1, significant harmonic values occur at Site 0, Site 2 and Site 8. Together with the network structure, this might be recognized as a route along which the major harmonic current flows. In cluster S2, however, large harmonic values are also found at two further sites, Site 3 and Site 10. This may be another possible route of harmonic current.

An issue that should also be taken into consideration is that capacitors may be mistaken for harmonic sources. The reason is that the impedance of a capacitor is inversely proportional to frequency, so even a small amount of harmonic voltage can lead to a very high harmonic current through the capacitor.
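The inverse proportionality can be written out for an ideal capacitor. The symbols are introduced here for illustration only: $f_1$ is the fundamental frequency, $C$ the capacitance, $h$ the harmonic order, and $V_h$, $I_h$ the harmonic voltage across and current through the capacitor.

```latex
X_C(h) = \frac{1}{2\pi h f_1 C} = \frac{X_C(1)}{h},
\qquad
I_h = \frac{V_h}{X_C(h)} = h\,\frac{V_h}{X_C(1)}
\quad\Longrightarrow\quad
\frac{I_h}{I_1} = h\,\frac{V_h}{V_1}
```

As an illustrative case, a 5th harmonic voltage of 2% of the fundamental would drive a 5th harmonic capacitor current of about 10% of the fundamental capacitor current.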

So these two possibilities (Site 0-Site 2-Site 8 and Site 0-Site 3-Site 10) need to be further analyzed to locate the harmonic distortion source.

TABLE I
ABUNDANCE VALUE OF THE SEVEN CLUSTERS GENERATED

Cluster No.   S0     S1     S2     S3     S4     S5     S6
Abundance     0.451  0.349  0.147  0.027  0.011  0.011  0.004

[Fig. 2. Main result clusters mapping to time axis. Axis: Time (s), with ticks at 0, 5, 6, 9, 10, 15 and 20.]

B. Identification of Harmonic Source and Compensation Capacitor

By just observing the harmonic currents, it is difficult to locate the harmonic source or capacitor. The reactive power shown in Fig. 4 provides the further insight that capacitor switching events have happened at Site 3.

The reactive power at Site 3 decreases at t=6s and increases

back at t=9s, which implies the capacitor is switched On and

Off at these two time points. This matches with the time range

of S2 as demonstrated in Fig. 2. Therefore, it can be deduced

that S2 is generated due to the use of the capacitor.

In addition, the cluster that has the lowest abundance is S6. Its abundance is only 0.004, which means only 4 cases fall in this cluster (1000 cases in total). Normally a small abundance indicates an abnormal occurrence, which may need further investigation. When mapping the 4 cases in S6 to the time axis, it is found that S6 lies at the two ends of S2. By observing the current waveform at Site 3 in Fig. 5, a capacitor switching event at around t=6s can be recognized by the high-frequency resonance. This confirms the analysis result mentioned before.

In actual conditions, the pattern of reactive power may not be so obvious, as the main function of a capacitor is to compensate the reactive power when the demand is high. Thus the value of the measured reactive power is expected to be approximately constant. In this situation, the clusters that represent transients are decisive for identifying capacitor switching events. This is one of the main advantages of the clustering algorithm: clusters with low abundance can potentially signify anomalous cases, which is of great help in deducing and analyzing unique or different operating conditions.

To conclude, the harmonic distortion source is located at Load5 and the compensation capacitor is near Load7. In addition, the distorted time periods are t=5-10s and t=15-20s, with the capacitor switched in during t=6-9s. These all agree with the initial settings of the grid. Fig. 6 shows the time periods of distortion, and Fig. 7 indicates the routes of harmonic current flow.

[Fig. 3. The result of 5th and 7th harmonic currents at Site 0-Site 11 in clusters 0-2]

[Fig. 4. Reactive power at Site 2 and Site 3. Horizontal axis: Time (s).]

[Fig. 5. Current waveform at Site 3 around the time when the capacitor is switched on (t=6s). Horizontal axis: Time (s), 5.96 to 6.08.]

C. Rule Generation

Once the harmonic distortion source has been identified, we may want to gain a closer insight into cluster S1, which represents cases with harmonic distortion, to understand the range of attribute values that form this cluster. The supervised learning C5.0 tool is applied to the measured data set and the mixture model generated by the Autoclass program. The generated clusters are used as class labels for the input data (fundamental current, 5th and 7th harmonic current). Generally, the data from all 12 sites can be used to generate rules. Since the network structure is simplified and contains only one harmonic source and one capacitor, here we choose only the data at Site 0 as input to simplify the calculation, because the current at Site 0 provides sufficient information about the whole substation. The discovered rule for cluster S1 is listed as follows:

• if I5 > 9.228
• and I7 ≤ 7.254
• then 1.000

This rule means that if the 5th harmonic current at Site 0 is larger than 9.228 A and the 7th harmonic current is at most 7.254 A, the case will be classified in S1. This makes sense because if there exist harmonic sources of order 5 and 7, the 5th and 7th harmonic currents will be higher than normal. Furthermore, if the 7th harmonic current becomes exceedingly high due to the capacitor switching in, another grid operation state will be recognized and consequently another cluster (in this example S2) will be created. Thus this rule gives a good criterion for cluster S1.

[Fig. 6. Time axis of distortion ('h' for harmonic distortion, 'C' for capacitor is on). Cluster sequence over Time (s): S0 (0-5), S1 (5-6), S2 (6-9), S1 (9-10), S0 (10-15), S1 (15-20).]

[Fig. 7. Location of harmonic source. Legend: route of harmonic source; route of capacitor harmonic current; other routes of harmonic current.]

The quality of the rules is described by two indices, which

are the number of instances assigned to the rules and the

proportion of correctly classified instances. In this example,

the number of data instances is 1000, with a proportion of

0.998 or 998 instances being correctly classified.
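Applying the discovered rule to new measurements, and computing the two quality indices for a single rule, can be sketched as follows. The thresholds are the ones reported for cluster S1; the function names and the four sample cases are hypothetical and are not taken from the simulation.

```python
def matches_s1_rule(i5, i7, i5_min=9.228, i7_max=7.254):
    """Rule for cluster S1: 5th harmonic above i5_min and 7th at most i7_max (A)."""
    return i5 > i5_min and i7 <= i7_max


def rule_quality(cases, labels, rule, target="S1"):
    """Return (instances covered by the rule, proportion of those labeled `target`)."""
    covered = [label for (i5, i7), label in zip(cases, labels) if rule(i5, i7)]
    if not covered:
        return 0, 0.0
    correct = sum(1 for label in covered if label == target)
    return len(covered), correct / len(covered)


# Hypothetical (I5, I7) pairs at Site 0 with their cluster labels.
cases = [(12.0, 3.0), (0.1, 0.1), (12.0, 20.0), (10.0, 7.254)]
labels = ["S1", "S0", "S2", "S1"]

covered_count, proportion_correct = rule_quality(cases, labels, matches_s1_rule)
new_case_is_s1 = matches_s1_rule(11.5, 2.0)
```

Once such rules exist, a new measurement is assigned to a cluster by a pair of comparisons, without rerunning the clustering.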

The usefulness of supervised learning using C5.0 is that it can help to analyse unusual events in a large amount of PQ data. Additionally, provided the input of C5.0 covers almost all possibilities, the cluster to which newly measured data belongs can easily be predicted without calculating or clustering again.

D. Requirements of Measurement

To use these algorithms in real application, the requirements

of measurement are as follows:

1) Time Scale of the Measurement: The suggested time interval according to [12] for measurements of harmonic, interharmonic and unbalance waveforms is 3 seconds for the very short interval. This is not enough if the detail of transients is needed, as the duration of a transient moment is usually as short as one cycle. In order to provide sufficient information for data mining, a time scale of at most 0.02 s (one cycle at 50 Hz) is required. In addition, if other power quality problems such as voltage dips are under consideration, the time scale should be even shorter since their duration is also shorter.

2) Placement among the Distribution Grid: In order to

locate the source of harmonic distortion among a distribution

network, there are 3 potential measurement locations: the

substation incoming supply; individual feeders; and POC in

the load area.

3) Data of Measurement: Basically, harmonic current measurements up to order 50 are required [12]. Additional measurements, such as reactive power, are optional; they may help to evaluate power quality data and analyze the cause of power quality events.

V. CONCLUSION

With the increasing amount of data available from the

harmonic monitoring system, it is becoming more difficult to

extract useful information from the large scale of harmonic

monitoring data. This paper has illustrated that the application

of data mining, in particular mixture modeling based on the

Minimum Message Length (MML) method, to power quality

data can identify useful patterns within the harmonic data

obtained from simulations of a distribution grid. Each resulting cluster can represent a specific operating condition, such as a capacitor switching operation or a harmonic event.

By observing the clusters obtained from the MML algorithm, the operation status of each time period could be recognized,

including the details of harmonic currents. Thus the sources of

disturbance in the distribution grid can be distinguished, and

the time schedule of the disturbance can be obtained. Other

available data (which is not used in the clustering algorithm)

such as reactive power measurements could be used together

to confirm the observations.

Once the clusters are generated using the MML method, the C5.0 algorithm, as a supervised learning method, is used to examine how the measured data is classified into various clusters and

to generate rules about the occurrences of cases using decision

tree technique. These generated rules can then be utilized

to predict which cluster any new data belongs to without

calculating again.

The proposed technique will be useful for power quality analysis in the distribution network and will help DSOs quickly recognize and categorize power quality events in the network. This study also serves as an initial investigation for Alliander N.V. (Dutch DSO), suggesting the possibility of applying data mining techniques to the evaluation of PQ data.

REFERENCES

[1] Voltage characteristics of electricity supplied by public distribution systems, Nederlandse norm Std. NEN-EN 50160, 2000.

[2] F. Provoost, "Intelligent distribution network design," Ph.D. dissertation, Eindhoven University of Technology, 2009.

[3] W. W. Dabbs and T. E. Sabin, "Probing power quality data," IEEE Transactions on Computer Applications, vol. 7, no. 2, pp. 8-14, 1994.

[4] H. Mannila, "Data mining: machine learning, statistics, and databases," in 8th Inter. Conf. on Scientific and Statistical Database Systems, 1996, pp. 2-9.

[5] R. Groth, Data Mining: Building Competitive Advantage. USA: Prentice Hall, 2000.

[6] A. T. M. Asheibi, "Discovery and pattern classification of large scale harmonic measurements using data mining," Ph.D. dissertation, University of Wollongong, 2009.

[7] P.-N. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining. Boston: Pearson Education, 2006.

[8] J. J. Oliver and D. J. Hand, Introduction to minimum encoding inference. UK: Dept. Statistics, Open University, 1994.

[9] G. McLachlan and T. Krishnan, The EM Algorithm and Extensions. John Wiley & Sons, 1997.

[10] I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques. San Francisco: Morgan Kaufmann, 2005.

[11] J. R. Quinlan, C4.5: Programs for machine learning. Morgan Kaufmann Publishers, Inc, 1993.

[12] Electromagnetic compatibility (EMC) - Part 4-30: Testing and measurement techniques - Power quality measurement methods, International Electrotechnical Commission Std. IEC 61000-4-30, 2008.
