A Voltage Phasor Based Fault Classification Method for PMU Only State Estimator Output


This is an Author's Accepted Manuscript of an article published in Electric Power Components and Systems, 43:1, 22-31, Date of Online Publication: 20 Nov 2014, Copyright: Taylor & Francis, available online at: www.tandfonline.com/doi/abs/10.1080/15325008.2014.956951

Fenghua Gao1,*, James S. Thorp2, Shibin Gao1, Anamitra Pal2, and Katelynn A. Vance2

1Southwest Jiaotong University, Chengdu, P. R. China.

2Virginia Tech, Blacksburg, USA.

* Corresponding Author. Address: the Department of Electrical Engineering, Southwest Jiaotong University, Chengdu, 610031, P. R. China. E-mail address: [email protected] (Fenghua Gao).

Abstract: This paper presents a fault classification method for transmission lines based on voltage phasors using classification and regression trees (CART). The proposed method is intended to aid system operators in understanding the outputs of a PMU only state estimator. Faults are classified into four categories when the estimator is positive sequence and into ten categories when the estimator is three phase. The fault data are generated in PowerWorld® and DSA Tools®. The pre-fault state covers a variety of operating conditions and loading angles of the faulted lines. The fault conditions comprise different fault types, fault locations, fault impedances, and fault incidence angles. The fault classification is done using MATLAB®. The approach is successfully tested on the IEEE-118 bus system. The results demonstrate that the technique developed here is effective and robust, irrespective of the pre-fault and fault conditions.

Keywords: Classification and Regression Tree (CART), Fault Classification, Fault Type, Fault Voltage, Phasor Measurement Units (PMUs), State Estimation, Voltage Phasors.

1. INTRODUCTION

In recent years, state estimation [1] has played a key role in energy management systems (EMS), and it has been extensively used to improve power system performance and reliability. The conventional state estimator, based on a supervisory control and data acquisition (SCADA) system, typically processes positive sequence data every two seconds. It is static because it is based on a data scan which assumes that the system does not change during the scan. It is also iterative because the measurements of real and reactive power flows and injections are nonlinear functions of the system state (bus voltages). Synchronized phasor measurement units (PMUs) [2] have been added to the conventional state estimator [3], but the resulting hybrid estimators are also nonlinear and the static assumption is still required. However, as more and more PMUs are installed in the power grid, a purely PMU based state estimator becomes realistic. A PMU only state estimator is linear and non-iterative and can quickly track the state of the power system. Therefore, it is potentially more accurate than the conventional state estimator [4]. Since it is customary for PMUs to be placed on the high voltage buses, it is expected that a PMU only state estimator will be limited to the high voltage network (345 kV and above).

Fault classification for relays is an area in which much research has been conducted over the past few decades [5]. Papers on this topic have used methods ranging from fuzzy logic [6] and wavelet transforms [7-8] to neural networks [9] and support vector machines (SVMs) [10]. The methodology presented in this paper distinguishes fault types based only on the voltage phasor, unlike traditional algorithms that require both voltages and currents. This paper is not intended for relaying applications; rather, it seeks to aid the operator in understanding the new displays obtained from a PMU only state estimator. Recently, the first three phase PMU-only state estimator was developed for Dominion Virginia Power (a utility in the US) and is currently in operation [4]. In this new estimator, the operator will see the output once a second under normal conditions. The operator will also have the option of pushing a button and seeing one second's worth of data obtained at every cycle. As a consequence of this fast sampling, faults, high speed recloses, and backup operations will be captured by this display. However, the problem faced by operators using the outputs of this estimator is that the arrays of complex voltages it generates (corresponding to faults, high speed recloses, etc.) cannot be distinguished visually. Since a display at this time scale (one frame per cycle) will be new for the operator, the implementation and acceptance of PMU only state estimators will be delayed without a proper explanation of this technology. The aim of this paper is to address this concern by classifying faults identified by PMU only state estimators using only the voltage phasor.


In this paper, two versions of the PMU only estimator are considered, namely, the positive sequence state estimator and the three phase state estimator. Using voltage phasors obtained from these two estimators, faults are classified into different types. The fault data are generated in PowerWorld® [11] and DSA Tools® [12]. The pre-fault state covers a variety of operating conditions and loading angles of the faulted lines. The faults are simulated on the high voltage lines, with the fault conditions comprising different fault types, fault locations, fault impedances, and fault incidence angles. Based on the fault data, a MATLAB® [13] implementation is then created to perform the fault classification. The results show that the proposed approach is accurate (100% for the positive sequence estimator and 99.981% on the testing cases for the three phase estimator, as reported in Section 4) and robust to changes in the fault condition and the pre-fault state.

The rest of the paper is organized as follows. In Section 2, a model system is introduced that illustrates how fault voltages can be obtained from PMUs. The analysis design for fault classification and the different fault types are also explained in this section. Section 3 gives a basic background of CART [14] and explains the logic followed for implementing the proposed algorithm. The results for the positive sequence and three phase state estimator outputs for the test system (IEEE-118 bus system [15]) are summarized in Section 4. In Section 5, decision making using the proposed approach is compared with two other algorithms: the first considers only voltage magnitudes, while the second considers voltage magnitudes and angles separately. The comparison is done with respect to complexity, precision, and error tolerance of inputs. The conclusion is provided in Section 6.

2. SYSTEM DESIGN FOR FAULT ANALYSIS

A. Model system

A four bus, three generator system is used to demonstrate fault analysis using only complex voltages. The single line diagram of the system is given in Figure 1 [5]. The lines are designed to emulate 300 km, 345 kV transposed transmission lines with PMUs installed at every bus. The lines are modeled as three phase exact pi sections for the appropriate line length and voltage level. Figures 2 and 3 depict the positive sequence and three phase outputs (voltage phasors) obtained for a typical system state under fault conditions.


Figure 1. One-line diagram of four bus model system (Buses 1 to 4; generators G1, G2, and G3; Lines 1 to 3)

Figure 2. Positive Sequence Voltage phasor at Bus 3 for b-g fault in Line 3 (panels: magnitude in kV and angle in degrees, each versus time in s)

Figure 3. Three Phase Voltage phasor at Bus 3 for b-g fault in Line 3 (panels: magnitude in kV and angle in degrees, each versus time in s, for Phases a, b, and c)


When a phase b to ground (b-g) fault is applied at 40% of Line 3 with a 30° fault incidence angle and 1 Ω fault resistance, the positive sequence and three phase complex voltages measured at Bus 3 are as depicted in Figure 2 and Figure 3, respectively. It is clear that the positive sequence voltage (in Figure 2) and the Phase b voltage (in Figure 3) decrease rapidly when the fault occurs. Similarly, it can be inferred that faults in other phases and lines will be reflected in their corresponding voltage plots [16]. In the following sub-sections, these complex voltages will be extracted and used to classify different types of faults.

B. Analysis design

The complex voltages obtained as outputs of the PMU only state estimator are sampled as fast as once per cycle. In this paper, it has been assumed that relays take at least one cycle to detect the fault and that the breakers take at least two cycles to open after the relay has detected the fault, so the fault persists for at least three cycles. Therefore, the second and third data samples will capture the steady state fault data. However, the fault might not coincide exactly with the beginning of a cycle (it might occur at half a cycle, three quarters of a cycle, etc.), so an earlier sample may be computed partly from pre-fault data. Taking all this into account, the third voltage sample after the fault is the best sample to use as the steady state fault voltage.
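The sample-selection rule described above can be sketched in a few lines of Python. This is only an illustration, assuming the estimator output is available as a list of per-cycle complex voltages and that the fault detector reports the index of the last pre-fault sample. The function name `steady_state_fault_sample` and all numerical values are hypothetical and are not taken from the paper.

```python
def steady_state_fault_sample(phasors, last_prefault_index, offset=3):
    """Return the sample taken `offset` cycles after the last pre-fault sample.

    phasors: list of complex voltages, one per cycle (assumed data layout).
    last_prefault_index: index of the last sample before fault inception.
    offset=3 follows the text: the third post-fault sample is used.
    """
    index = last_prefault_index + offset
    if index >= len(phasors):
        raise ValueError("not enough post-fault samples")
    return phasors[index]


# Hypothetical per-cycle positive sequence voltages in per unit.
samples = [1.00 + 0.00j, 1.00 + 0.01j, 0.80 - 0.05j,
           0.62 - 0.10j, 0.61 - 0.10j, 0.61 - 0.10j]
fault_voltage = steady_state_fault_sample(samples, last_prefault_index=1)
```

Here the fault is taken to begin after the sample at index 1, so the sample at index 4 is returned as the steady state fault voltage.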

The flow diagram of the fault analysis process is given in Figure 4. It starts with the outputs of a PMU only state estimator. These outputs are the voltage phasors, which are fed into the fault detector. The fault detection algorithm is based on the dynamic state prediction method developed in [17]. If there is a fault, the fault detector provides the voltage phasors at the terminal ends of the faulted line to the fault classifier, which then determines the fault type.

Figure 4. The flow-process of fault analysis based on PMU only state estimator output (PMU only state estimator output -> voltage phasor -> fault detector -> fault classifier -> fault type)


C. Fault classification types

For a positive sequence state estimator output, short circuit faults are classified into four types, namely single-line to ground faults, line-to-line faults, double-line to ground faults, and three phase faults. For a three phase state estimator, ten fault types are considered, namely single-line to ground faults on Phase a, Phase b, and Phase c; line-to-line faults on Phases a and b, Phases b and c, and Phases a and c; double-line to ground faults on Phases a and b, Phases b and c, and Phases a and c; and three phase faults. CART has been used in this paper to identify the fault type. A variety of fault conditions are simulated in PowerWorld® and DSA Tools® to generate the data. After post-processing, the data are used as input to the MATLAB implementation of CART for decision making. It is expected that the proposed technique will provide operators with a label for the plots that the PMU only state estimator generates.
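The relationship between the ten three phase labels and the four positive sequence categories can be written as a simple lookup table. The sketch below uses the abbreviations that the paper introduces in Section 4 (LG, LL, LLG, LLL, and their per-phase variants). The dictionary and function names are illustrative assumptions and are not part of the original MATLAB implementation.

```python
# Ten three phase fault labels mapped to the four positive sequence categories.
THREE_PHASE_TO_POSITIVE_SEQUENCE = {
    "LGa": "LG", "LGb": "LG", "LGc": "LG",            # single-line to ground
    "LLab": "LL", "LLbc": "LL", "LLca": "LL",         # line-to-line
    "LLGab": "LLG", "LLGbc": "LLG", "LLGca": "LLG",   # double-line to ground
    "LLL": "LLL",                                     # three phase
}


def positive_sequence_category(three_phase_label):
    """Return the coarser four-class label for a ten-class label."""
    return THREE_PHASE_TO_POSITIVE_SEQUENCE[three_phase_label]


category = positive_sequence_category("LLGbc")
```

The mapping only goes one way: a positive sequence label such as LG does not identify which phase is faulted, which is why the three phase estimator supports the finer ten-class labelling.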

3. ALGORITHM SELECTION AND IMPLEMENTATION FOR FAULT CLASSIFICATION

A. Classification and regression tree (CART)

A decision tree is a machine-learning tool that extracts information from large sets of data and constructs a prediction model based on the gained information. A classification and regression tree (CART) is a non-parametric decision tree learning technique that intuitively represents the gained knowledge in the form of if-else statements. It extracts a manageable number of core measures from the database, which are then used to classify the data. As illustrated in [18], decision trees are widely used in power systems, with CART being the most popular tool because of its high precision and robustness. Typical examples of decision tree applications are adaptive protection schemes [19], online monitoring schemes [20], transient stability analysis [21], and control analysis [22]. A commercial implementation of CART (CART®) has also been developed by Salford Systems [23].

The objective of CART is to perform an optimal split for a given problem. The splitting variables specify the required measurements, while the structure of the tree determines how new data will be sorted. For classification purposes, CART takes all variables from the given data set and picks suitable splitting variables for performing optimal splits. If 𝑝 is a list of attributes and 𝑑 denotes the splitting variables that have been picked, then this logic can be referred to as “𝑝 chooses 𝑑” [24]. However, synchrophasor data consist of complex numbers (complex voltages, currents, etc.). The traditional CART algorithm considers the two components of a complex number (real and imaginary) separately and picks one of them at a time for decision making. As demonstrated in [25], this may not always produce optimal results. An algorithm called Fisher’s linear discriminant applied to synchrophasor data (FLDSD) was developed in [25] for applying CART to high dimensional multivariate data. When two classes are considered, FLDSD represents multi-dimensional data points by a one-dimensional variable. This variable is the distance from an optimally selected hyperplane. When multiple classes are present, the process is repeated by taking two classes at a time. If n denotes the number of classes, then the maximum number of hyperplanes needed to split the data is given by (1).

$$\text{Number of hyperplanes} = \frac{n(n-1)}{2} \tag{1}$$

It follows that the number of variables produced is equal to the number of hyperplanes formed. For example, n = 4 classes give at most 6 hyperplanes, and n = 10 classes give at most 45. CART then picks those one-dimensional variables which are needed to perform an optimal split. More details about CART and its operations can be found in [14], [23], and [26].
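The idea of replacing multi-dimensional points by signed distances from pairwise hyperplanes can be sketched with a textbook two-class Fisher linear discriminant. This is a generic illustration under stated assumptions, not the FLDSD algorithm of [25], whose details are not reproduced here. The helper names, the midpoint placement of the hyperplane, and the two-dimensional sample points are all hypothetical.

```python
from itertools import combinations
import math


def mean(points):
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def scatter(points, m):
    """Within-class scatter matrix: sum of (x - m)(x - m)^T over the points."""
    dim = len(m)
    s = [[0.0] * dim for _ in range(dim)]
    for p in points:
        diff = [p[k] - m[k] for k in range(dim)]
        for i in range(dim):
            for j in range(dim):
                s[i][j] += diff[i] * diff[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fisher_hyperplane(class_a, class_b):
    """Return (unit normal w, offset b) of a two-class Fisher discriminant."""
    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    dim = len(ma)
    sw = [[sa[i][j] + sb[i][j] for j in range(dim)] for i in range(dim)]
    w = solve(sw, [ma[k] - mb[k] for k in range(dim)])
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Assumption: place the hyperplane midway between the two class means.
    b = -sum(w[k] * (ma[k] + mb[k]) / 2 for k in range(dim))
    return w, b


def distance(point, hyperplane):
    """Signed distance of a point from the hyperplane (w has unit length)."""
    w, b = hyperplane
    return sum(wk * xk for wk, xk in zip(w, point)) + b


def pairwise_hyperplanes(classes):
    """One hyperplane per pair of classes, i.e. n(n-1)/2 in total."""
    return {(p, q): fisher_hyperplane(classes[p], classes[q])
            for p, q in combinations(sorted(classes), 2)}


# Hypothetical two-dimensional training points for four fault classes.
classes = {
    "LG": [(0.9, 0.0), (1.0, 0.1), (1.1, 0.0)],
    "LL": [(0.5, 0.5), (0.6, 0.6), (0.7, 0.45)],
    "LLG": [(0.2, 0.9), (0.3, 1.0), (0.4, 0.9)],
    "LLL": [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0)],
}
planes = pairwise_hyperplanes(classes)
distance_variables = {pair: distance((0.95, 0.05), hp) for pair, hp in planes.items()}
```

With four classes, `planes` holds six hyperplanes, so each data point is described by six distance variables. A tree learner would then choose among these variables when splitting.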

B. Algorithm implementation for fault classification

The voltage based fault classification method introduced in this paper has been analyzed and tested on the IEEE-118 bus system [15]. PMUs are placed on the eleven 345 kV buses to monitor the ten high-voltage lines. The outputs are generated as fast as once per cycle. For each of the lines, four kinds of faults are applied to generate positive sequence fault data, whereas ten kinds of faults are applied to generate three phase fault data. For both estimators, the fault voltage of the estimator output is chosen as the input to train the fault classification tree. CART is used to pick the measurements that classify the fault. The complex measurements (voltage magnitudes and angles) from the terminal ends of the lines are set as inputs to CART for fault classification. This means that CART uses two magnitudes and two angles for the positive sequence state estimator output, and six magnitudes and six angles for the three phase state estimator output. Thus, the data points are four dimensional for the positive sequence state estimator and twelve dimensional for the three phase state estimator. Since the input data to CART are complex numbers, FLDSD is used to represent them as one-dimensional variables.
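The dimensions quoted above follow directly from flattening each complex voltage into a magnitude and an angle. A minimal sketch is given below, assuming the phasors are available as Python complex numbers and that angles are expressed in degrees. The function name `phasor_features` and the sample values are hypothetical.

```python
import cmath
import math


def phasor_features(phasors):
    """Flatten complex voltages into [mag_1, ang_1, mag_2, ang_2, ...].

    Angles are returned in degrees (an assumption made for this sketch).
    """
    features = []
    for v in phasors:
        magnitude, angle = cmath.polar(v)
        features.extend([magnitude, math.degrees(angle)])
    return features


# Positive sequence: one phasor per line end gives a four dimensional point.
positive_sequence_point = phasor_features([
    cmath.rect(0.62, math.radians(-9.0)),   # hypothetical voltage at end 1
    cmath.rect(0.71, math.radians(-4.0)),   # hypothetical voltage at end 2
])

# Three phase: three phasors per line end give a twelve dimensional point.
three_phase_point = phasor_features([1 + 0j] * 6)
```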

The MATLAB implementation of CART has been used in this paper to categorize the fault type. Figure 5 depicts the tree-growing logic that was developed. The complex voltage measurements obtained from PMUs are represented by one-dimensional distance variables. These distance variables then become the inputs to CART. The training data are initially used to create a full, unpruned classification tree. Next, this tree is pruned to obtain the smallest tree (T) whose error is within one standard deviation of that of the minimum cost sub-tree. If the desired accuracy is obtained, then this tree is the final fault classification tree. If the desired accuracy is not obtained, then Tree T is pruned one level at a time to obtain a family of possible classification trees with different tree sizes. The tree with the smallest number of terminal nodes that satisfies the accuracy criterion is then selected as the final fault classification tree.
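The two selection rules in this procedure can be sketched independently of any tree library. The code below assumes that each candidate sub-tree has already been summarized by its number of terminal nodes, an estimated cost with its standard error, and a test accuracy. These record fields, the function names, and all numbers are hypothetical and are not results from the paper.

```python
def one_standard_error_tree(candidates):
    """Smallest tree whose cost is within one standard error of the minimum cost."""
    best = min(candidates, key=lambda t: t["cost"])
    threshold = best["cost"] + best["se"]
    eligible = [t for t in candidates if t["cost"] <= threshold]
    return min(eligible, key=lambda t: t["leaves"])


def smallest_accurate_tree(family, required_accuracy):
    """Smallest tree in the family that meets the accuracy criterion, else None."""
    accurate = [t for t in family if t["accuracy"] >= required_accuracy]
    return min(accurate, key=lambda t: t["leaves"]) if accurate else None


# Hypothetical pruning sequence: leaves, estimated cost, standard error.
candidates = [
    {"leaves": 16, "cost": 0.010, "se": 0.004},
    {"leaves": 12, "cost": 0.012, "se": 0.004},
    {"leaves": 10, "cost": 0.013, "se": 0.004},
    {"leaves": 6, "cost": 0.080, "se": 0.010},
]
tree_t = one_standard_error_tree(candidates)

# Hypothetical family of trees with test accuracies in percent.
family = [
    {"leaves": 10, "accuracy": 99.98},
    {"leaves": 9, "accuracy": 97.00},
    {"leaves": 8, "accuracy": 92.50},
]
final_tree = smallest_accurate_tree(family, required_accuracy=99.9)
```

In this made-up example the minimum cost tree has 16 terminal nodes, but the 10-node tree lies within one standard error of it and is therefore preferred as the simpler model.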

4. SIMULATION RESULTS

The total number of training cases for the positive sequence state estimator was 7,600, whereas it was 11,400 for the three phase state estimator. These cases were created by considering different fault locations and fault impedances, and by adding noise to the voltage phasors in PowerWorld®. For the simulations, fault locations ranged from 5% to 95% of the line in steps of 5%. Fault impedances were set at 0 and 10 Ohms for the positive sequence state estimator, and at 0, 10, and 100 Ohms for the three phase state estimator. The loading angles ranged from -12° to 7° for different faulted lines. A fixed error of 1% was introduced in the three phase PMU measurements. White noise of 1% was then added to all the measurements. The cases with fault locations varying from 5% to 95% in steps of 10% were used to train the trees, while the remaining cases were used to test the trees.
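The location-based split described above can be reproduced with a short sketch. The location grids come from the text. The per-location case counts additionally assume that the cases are spread evenly over the 19 fault locations, which the paper does not state explicitly. The variable names are illustrative.

```python
# Fault locations in percent of line length.
all_locations = list(range(5, 100, 5))       # 5%, 10%, ..., 95%
train_locations = list(range(5, 100, 10))    # 5%, 15%, ..., 95%
test_locations = [loc for loc in all_locations if loc not in train_locations]

# Assumption: cases are evenly distributed over the fault locations.
total_three_phase_cases = 11400
train_cases = total_three_phase_cases * len(train_locations) // len(all_locations)
test_cases = total_three_phase_cases * len(test_locations) // len(all_locations)

# Accuracy implied by a single misclassified testing case under this assumption.
accuracy_with_one_error = round(100 * (1 - 1 / test_cases), 3)
```

Under the even-distribution assumption, 10 of the 19 locations are used for training and 9 for testing, giving 6,000 training and 5,400 testing cases for the three phase estimator. One misclassification among 5,400 testing cases corresponds to 99.981%, which agrees with the testing accuracy and error count reported for the proposed approach in Table III.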


Figure 5. Implementation of the proposed algorithm (in MATLAB®). Flowchart steps:

1. Use FLDSD to compute one-dimensional distance variables from complex PMU measurements.
2. Set the distance variables as inputs to CART.
3. Grow classification tree with maximum possible number of terminal nodes.
4. Choose the tree (T) within one standard deviation error of the minimum cost sub-tree.
5. Desired accuracy obtained? If yes, T is the final fault classification tree.
6. If no, prune the Tree (T) one level at a time to obtain a family of possible classification trees.
7. Test performance of the family of trees and choose the smallest tree with desired accuracy as the final fault classification tree.


A. Positive sequence state estimator output

For this case, the faults are partitioned into four groups: single-line to ground fault (LG), line-to-line fault (LL), double-line to ground fault (LLG), and three phase fault (LLL). The tree was first constructed based on the training data from the faulted line, which means that one tree is created to perform the fault classification for one line. Taking line 30-38 as an example, the resulting fault classification tree is shown in Figure 6. Table I summarizes the splitting variables and their corresponding values for the decision trees of all the lines. From the table, it can be inferred that only two splitting variables (d1-2 and d2-3) are required for classifying the fault type for this system. The accuracy for all the lines was found to be 100%; that is, no misclassifications were made while classifying faults based on positive sequence state estimator outputs. To further test the robustness of the algorithm, more simulations were performed in DSA Tools®. These included linear, quadratic, and random load increases, as well as changes in fault incidence angles. For the ten faulted lines, fault incidence angles were varied from 0° to 90° with the fault locations selected randomly from the interval [10%, 90%] of each line. Four types of faults were simulated to generate a total of 2,880 cases. The accuracy of fault classification using this approach was still found to be 100%.

Figure 6. Fault classification tree for line 30-38 based on positive sequence voltage phasors


TABLE I
SPLITTING VARIABLES AND THEIR VALUES FOR POSITIVE SEQUENCE STATE ESTIMATOR OUTPUT

Faulted   S1 Splitting  S1 Splitting  S2 Splitting  S2 Splitting  S3 Splitting  S3 Splitting
Line      Variable      Value         Variable      Value         Variable      Value
8-9       d1-2           0.0007       d1-2          -0.1495       d1-2          -0.3632
8-30      d1-2          -0.0003       d1-2          -0.1448       d1-2          -0.3414
9-10      d1-2           0.0007       d1-2          -0.1817       d1-2          -0.4384
26-30     d1-2          -0.0005       d1-2          -0.0934       d1-2          -0.2099
30-38     d1-2           0.0003       d1-2          -0.1645       d1-2          -0.3925
38-65     d1-2          -0.0047       d2-3          -0.0028       d1-2          -0.1313
63-64     d1-2           0.0002       d1-2          -0.1677       d1-2          -0.3995
64-65     d1-2          -0.0013       d1-2          -0.1813       d1-2          -0.4252
65-68     d1-2           0.0005       d1-2          -0.1937       d1-2          -0.4601
68-81     d1-2          -0.0012       d1-2          -0.1668       d1-2          -0.3951

B. Three phase state estimator output

The three phase state estimator tracks all three phases and provides more detailed fault information than the positive sequence state estimator output. Based on the fault voltage of the three phase state estimator output, the faults are classified into ten categories: single-line to ground faults on Phase a (LGa), Phase b (LGb), and Phase c (LGc); line-to-line faults on Phases a and b (LLab), Phases b and c (LLbc), and Phases a and c (LLca); double-line to ground faults on Phases a and b (LLGab), Phases b and c (LLGbc), and Phases a and c (LLGca); and three phase faults (LLL). By constructing one tree for all faulted lines, 45 hyperplanes are formed, resulting in 45 distance variables being set as inputs to CART. In accordance with the implementation of the proposed algorithm as summarized in Section 3B, the smallest tree with ten terminal nodes is selected as the final fault classification tree. This tree is presented in Figure 7. From the tree, it becomes clear that only 9 hyperplanes out of 45 are required to perform an optimal split. The accuracy obtained by using this tree is 100% for the training cases and 99.981% for the testing cases. In the case of an actual event, the nine distances of the data points from the corresponding hyperplanes have to be computed. Once these distances are known, the fault type can be identified by following the tree to the respective terminal node.

Figure 7. Fault classification tree for all ten high voltage lines based on three phase voltage phasors

5. ALGORITHM COMPARISON

In this section, the performance of the proposed algorithm is compared with two other algorithms. The first algorithm uses only voltage magnitudes to perform fault classification [27], while the second algorithm classifies faults by considering voltage magnitudes and angles separately (traditional CART algorithm). For both algorithms, the IEEE-118 bus system is used as the test system, with one tree created per line for the positive sequence state estimator output and one tree created for all the lines for the three phase state estimator output. The following sub-sections summarize the results obtained.


A. Result comparison with the algorithm that considers only voltage magnitudes for fault classification

From Figures 2 and 3, one can hypothesize that the voltage magnitude alone can be used for classifying different types of faults. To test this logic, an algorithm was developed in [27] that used only voltage magnitudes for fault classification. Using the IEEE-118 bus system as the test system and incorporating similar error and white noise percentages, the precision of the trees created using only voltage magnitudes is compared with that of the trees obtained using the proposed approach. Table II shows the trees with the best and worst performances for the positive sequence state estimator output. These were picked from the ten trees created for the ten lines for each of the two algorithms. It was found that the tree for line 38-65 had the worst performance (93.06% test accuracy) for the algorithm using voltage magnitudes alone, whereas faults on all ten lines were identified with 100% accuracy when the voltage phasors were treated as a single entity, as in the proposed approach. Note that the worst performance was judged on the complexity of the tree (more terminal nodes and greater tree depth) and its resulting lower accuracy. For the three phase state estimator output shown in Table III, two trees with different numbers of terminal nodes were created for the algorithm using only voltage magnitudes to show how the error count decreases as the number of terminal nodes is increased. Even so, the table shows that the trees created using the proposed algorithm are simpler, smaller, and more accurate than the ones obtained using the algorithm based on voltage magnitudes alone.

To test the effect of white noise on both algorithms, the white noise percentage was increased from 0% to 20% in steps of 1%. The training and testing of the trees were repeated for all 21 cases (0% to 20%) for both algorithms. Figures 8 and 9 depict the results obtained. The stars denote the percentage accuracies for the proposed algorithm, while the circles denote the same for the algorithm that uses voltage magnitudes alone. From the plots, it can be inferred that as the percentage of white noise is increased, the accuracy of the proposed algorithm remains high while that of the algorithm using voltage magnitudes alone drops considerably. Based on these results, the trees generated using the proposed algorithm are more accurate, more reliable, and more resilient to errors in inputs than those generated by the algorithm that uses only voltage magnitudes.

TABLE II
COMPARISON OF ALGORITHM PERFORMANCE FOR POSITIVE SEQUENCE STATE ESTIMATOR OUTPUT

                    The Tree with the Best Performance                  The Tree with the Worst Performance
                    Number of  Tree   Learning      Testing             Number of  Tree   Learning      Testing
                    Terminal   Depth  Accuracy (%)  Accuracy (%)        Terminal   Depth  Accuracy (%)  Accuracy (%)
Algorithm           Nodes                                               Nodes
Voltage Magnitude   4          3      100           100                 10         8      98.50         93.06
Proposed Approach   4          3      100           100                 4          3      100           100

TABLE III
COMPARISON OF ALGORITHM PERFORMANCE FOR THREE PHASE STATE ESTIMATOR OUTPUT

                    Number of  Tree   Learning      Testing       Error Counts for
                    Terminal   Depth  Accuracy (%)  Accuracy (%)  11400 cases
Algorithm           Nodes
Voltage Magnitude   10         6      99.35         99.296        77
Voltage Magnitude   16         8      99.983        99.963        3
Proposed Approach   10         5      100           99.981        1


Figure 8. The average accuracy of independent testing cases for positive sequence state estimator

Figure 9. The accuracy of independent testing cases for three phase state estimator

B. Result comparison with the algorithm that considers voltage magnitudes and angles separately for

fault classification

The traditional CART algorithm considers each attribute separately for performing a split. Therefore, a

complex number is treated as two separate real numbers. When this logic is used for fault classification

purposes for a positive sequence state estimator, four measurements (mag_end1, ang_end1, mag_end2,

ang_end2, denoting the voltage magnitudes and angles from the terminal ends of the lines) become inputs

to CART. For the IEEE-118 bus system, ten trees are constructed for the ten high voltage lines present.

[Plot residue for Figures 8 and 9. Horizontal axis: percentage of the white noise (%), 0 to 20. Vertical axis: accuracy (%).]

Figure 10 depicts the tree obtained for line 30-38, with mag_end1, mag_end2, and ang_end2 as

the splitting variables. Thus, three attributes (the voltage magnitudes of buses 30 and 38, and the voltage angle

of bus 38) are picked by the traditional CART algorithm to classify the fault. Comparing the tree

in Figure 10 with that in Figure 6 shows that the proposed approach requires one attribute

to classify the fault, whereas the traditional CART approach requires three. Thus,

by treating complex PMU data as a single entity, fewer attributes yielded higher accuracy

(as seen in Table IV, which compares misclassification rates for the two approaches).
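
To make the traditional input format concrete, the sketch below splits the two terminal voltage phasors of a line into the four real-valued attributes named above. The function name separate_features, the use of degrees for angles, and the sample per-unit phasors are assumptions made for illustration. The preprocessing used by the proposed approach is not reproduced here.

```python
import cmath
import math


def separate_features(v_end1, v_end2):
    """Return [mag_end1, ang_end1, mag_end2, ang_end2].

    Each complex phasor is treated as two separate real attributes,
    which is how a traditional CART input would be laid out.
    Angles are in degrees (an assumption).
    """
    features = []
    for v in (v_end1, v_end2):
        features.append(abs(v))                        # voltage magnitude
        features.append(math.degrees(cmath.phase(v)))  # voltage angle
    return features


# Hypothetical per-unit phasors for the two ends of a line.
v_end1 = cmath.rect(0.80, math.radians(-10.0))
v_end2 = cmath.rect(0.65, math.radians(-25.0))

features = separate_features(v_end1, v_end2)
```

Each entry of the returned list is then a candidate splitting variable that the tree can threshold independently of the others.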

A similar process is followed for the three phase state estimator output for which the input consists of

twelve parameters (three phase voltage magnitudes and angles for the two terminal ends). The resulting

tree is shown in Figure 11. Comparing Figure 11 with Figure 7 and using Table IV shows that,

with the same number of measurements, a higher accuracy is obtained using the proposed approach. Thus, the

proposed algorithm is superior to the traditional CART algorithm for both the positive sequence and

the three phase state estimators. To conclude, the comparisons outlined in the last two subsections

strongly support performing a preprocessing step which will allow CART to use a single entity as the

voltage phasor input. Additionally, the proposed algorithm successfully demonstrates that it can

accurately identify fault types by evaluating the outputs of a PMU only state estimator.
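
As a rough illustration of the two kinds of input discussed in this subsection, the sketch below builds the twelve-parameter vector from three phase phasors at both line ends and also computes the positive sequence phasor with the standard symmetrical component transformation. The attribute ordering, the function names, and the sample phasors are assumptions. The section does not state how the estimator outputs were actually arranged.

```python
import cmath
import math

# Symmetrical component operator a = 1 at an angle of 120 degrees.
A = cmath.rect(1.0, math.radians(120.0))


def positive_sequence(va, vb, vc):
    """Positive sequence phasor V1 = (Va + a*Vb + a^2*Vc) / 3."""
    return (va + A * vb + A * A * vc) / 3.0


def twelve_features(end1, end2):
    """Flatten (Va, Vb, Vc) at both ends into 12 magnitude/angle values.

    Ordering (magnitude, angle per phase, end 1 then end 2) is assumed.
    """
    features = []
    for phases in (end1, end2):
        for v in phases:
            features.extend([abs(v), math.degrees(cmath.phase(v))])
    return features


# Hypothetical per-unit data: a balanced set and a set with a phase A sag.
balanced = tuple(cmath.rect(1.0, math.radians(a)) for a in (0.0, -120.0, 120.0))
sagged = (0.3 * balanced[0], balanced[1], balanced[2])

inputs = twelve_features(balanced, sagged)
v1_balanced = positive_sequence(*balanced)
v1_sagged = positive_sequence(*sagged)
```

For the balanced set the positive sequence magnitude is 1.0 per unit, while the sag on one phase lowers it. This is the kind of information a positive sequence estimator output carries.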

Figure 10. Fault classification tree for line 30-38 for positive sequence state estimator when considering

magnitudes and angles separately

Figure 11. Fault classification tree for three phase state estimator when considering magnitudes

and angles separately

TABLE IV

COMPARISON OF ALGORITHM PERFORMANCE FOR PMU ONLY STATE ESTIMATOR OUTPUT

| Scenario | Traditional CART approach: Number of Nodes | Traditional CART approach: Misclassification Rate (%) | Proposed Algorithm: Number of Nodes | Proposed Algorithm: Misclassification Rate (%) |
|---|---|---|---|---|
| Positive Sequence State Estimator | 4 or 5 | 2.53 | 4 | 0 |
| Three Phase State Estimator | 10 | 1.50 | 10 | 0.0088 |
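
As a quick arithmetic check, and assuming the misclassification rate is simply the number of wrongly classified cases divided by the total number of cases (expressed in percent), one error in 11400 cases reproduces the 0.0088 figure listed for the proposed algorithm with the three phase state estimator. The function name is illustrative only.

```python
def misclassification_rate(errors, cases):
    """Percentage of cases assigned the wrong fault type."""
    return 100.0 * errors / cases


rate = misclassification_rate(1, 11400)
print(f"{rate:.4f}")  # prints 0.0088
```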

6. CONCLUSION

This paper presents a voltage phasor based fault classification technique for transmission lines that

uses CART and is intended for labeling the fault type for system operators who will be seeing the outputs

of a PMU only state estimator. The novelty of this paper lies in the fact that, unlike other fault

classification techniques that need currents as well as voltages, the proposed approach can successfully

classify faults using only the voltage phasor (obtained directly from the outputs of a PMU only state

estimator). The proposed algorithm handles two categories of voltage phasor data (positive sequence and

three phase). PowerWorld® and DSA Tools® were used for the data generation process, while MATLAB®

was used for the classification process. Using positive sequence voltage phasors, faults were classified

into four types, whereas using three phase voltage phasors, faults were classified into ten types.

The training and testing data were generated under a variety of system conditions. The precision of the

algorithm was evaluated by applying it to completely independent data sets. The robustness of the

algorithm was validated by changing different parameters (pre-fault and fault conditions) and adding

white noise to the inputs. The parameters that were varied include the fault impedance, the fault incidence

angle, the fault location, the loading angle of the line, and the load trajectory before and/or during the

fault. Compared with the algorithm that uses only voltage magnitude and the one that treats voltage

magnitude and angle separately, the tree created using the proposed approach was found to classify short

circuit faults with higher accuracy. Simulation results indicate that the proposed technique is highly

effective, robust, and easily applicable to large power system networks.

7. ACKNOWLEDGEMENTS

The authors gratefully acknowledge the financial support provided by the Lockheed Martin

Corporation. Special thanks also go to the faculty members and students of the Power Lab at Virginia

Tech for their valuable suggestions and useful discussions. The authors would like to especially thank Dr.

Jaime De La Ree and Dr. Victor Ortiz for their generous help.

8. REFERENCES

[1] A. Abur, and A. G. Exposito, Power System State Estimation: Theory and Implementation. New

York: Marcel Dekker, 2004.

[2] A. G. Phadke, and J. S. Thorp, Synchronized Phasor Measurements and Their Applications. Springer

Science and Business Media, 2008, pp. 141-144.

[3] M. Zhou, V. A. Centeno, J. S. Thorp, and A. G. Phadke, “An Alternative for Including Phasor

Measurements in State Estimators,” IEEE Trans. Power Syst., vol. 21, no. 4, pp. 1930-1937, Nov.

2006.

[4] K. D. Jones, J. S. Thorp, and R. M. Gardner, “Three-phase linear state estimation using phasor

measurements,” in Proc. IEEE Power Eng. Soc. Gen. Meeting, Vancouver, BC, Canada, pp. 1-5, 21-

25 Jul. 2013.

[5] A. G. Phadke, and J. S. Thorp, Computer Relaying for Power Systems (2nd Ed.). John Wiley, 1988,

pp. 96-98, 144, 166-174.

[6] O. A. S. Youssef, “Combined fuzzy-logic wavelet-based fault classification technique for power

system relaying,” IEEE Trans. Power Del., vol. 19, no. 2, pp. 582-589, Apr. 2004.

[7] K. M. Silva, B. A. Souza, and N. S. D. Brito, “Fault detection and classification in transmission lines

based on wavelet transform and ANN,” IEEE Trans. Power Del., vol. 21, no. 4, pp. 2058-2063, Oct.

2006.

[8] Z. He, L. Fu, S. Lin, and Z. Bo, “Fault detection and classification in EHV transmission line based on

wavelet singular entropy,” IEEE Trans. Power Del., vol. 25, no. 4, pp. 2156-2163, Oct. 2010.

[9] T. Dalstein, and B. Kulicke, “Neural network approach to fault classification for high speed

protective relaying,” IEEE Trans. Power Del., vol. 10, no. 2, pp. 1002-1011, Apr. 1995.

[10] P. K. Dash, S. R. Samantaray, and G. Panda, “Fault classification and section identification of an

advanced series-compensated transmission line using support vector machine,” IEEE Trans. Power

Del., vol. 22, no. 1, pp. 67-73, Jan. 2007.

[11] PowerWorld, PowerWorld Corporation. http://www.powerworld.com/

[12] DSA Tools, Powertech Labs Inc., Surrey, British Columbia, Canada. http://www.DSATools.com/

[13] MATLAB version 7.10.0, (computer software), The MathWorks Inc., 2010.

[14] L. Breiman, J. Friedman, R. Olshen, and C. Stone, Classification and Regression Trees.

Wadsworth & Brooks/Cole Advanced Books & Software, Pacific Grove, CA, 1984.

[15] IEEE-118 Bus System. [Online]. Available:

http://www.ee.washington.edu/research/pstca/pf118/pg_tca118bus.htm

[16] S. M. Deshmukh, B. Dewani, and S. P. Gawande, “A review of power quality problems-voltage

sags for different faults,” Int. J. Sci. Eng. Tech., vol. 2, no. 5, pp. 392-297, May 2013.

[17] F. Gao, J. S. Thorp, A. Pal, and S. Gao, “Dynamic state prediction based on auto-regressive (AR)

model using PMU data,” in Proc. IEEE Power and Energy Conference at Illinois (PECI),

Champaign, IL, pp. 1-5 Feb. 2012.

[18] I. H. Mori, “State-of-the-art overview on data mining in power systems,” in Proc. IEEE Power

Eng. Soc. General Meeting, Montreal, QC, Canada, pp. 1-5, 18-22 Jun. 2006.

[19] E. E. Bernabeu, J. S. Thorp, and V. A. Centeno, “Methodology for a security/dependability

adaptive protection scheme based on data mining,” IEEE Trans. Power Del., vol. 27, no. 1, pp. 104-

111, Jan. 2012.

[20] R. F. Nuqui, A. G. Phadke, R. P. Schulz, and N. Bhatt, “Fast on-line voltage security monitoring

using synchronized phasor measurements and decision trees,” in Proc. IEEE Power Eng. Soc. Winter

Meeting, Columbus, OH, vol. 3, pp. 1347-1352, 28 Jan.-1 Feb. 2001.

[21] M. Li, A. Pal, A. G. Phadke, and J. S. Thorp, “Transient stability prediction based on apparent

impedance trajectory recorded by PMUs,” Int. J. Elect. Power Energy Syst., vol. 54, pp. 498-504, Jul.

2013.

[22] G. Qun, and S. M. Rovnyak, “Decision trees using synchronized phasor measurements for wide-

area response-based control,” IEEE Trans. Power Syst., vol. 26, no. 2, pp. 855-861, May 2011.

[23] Salford Systems, CART. [Online]. Available: http://www.salford-systems.com/cart.php.

[24] S. K. Murthy, S. Kasif, and S. Salzberg, “A system for induction of oblique decision trees,” J.

Artific. Intell. Res., vol. 2, no. 1, pp. 1-32, 1994.

[25] A. Pal, J. S. Thorp, T. Khan, and S. S. Young, “Classification trees for complex

synchrophasor data,” Elect. Power Compon. Syst., vol. 41, no. 14, pp. 1381-1396, Sep. 2013.

[26] D. Steinberg, and P. Colla, CART: Tree-Structured Non-Parametric Data Analysis. San Diego, CA:

Salford Systems, 1995.

[27] F. Gao, J. S. Thorp, S. Gao, and A. Pal, “Fault analysis based on PMU only state estimator,”

submitted to Automation of Electric Power Systems.