

  • IEEE General Meeting 2011-PNNL-SA-77218

    A Multi-layer, Hierarchical Information Management System for the Smart Grid

    Ning Lu, Pengwei Du, Patrick Paulson, Frank Greitzer, Xinxin Guo, and Mark Hadley

    978-1-4577-1002-5/11/$26.00 ©2011 IEEE

    This work is supported by the Pacific Northwest National Laboratory, operated for the U.S. Department of Energy by Battelle under Contract DE-AC05-76RL01830.

    N. Lu, P. Du, X. Guo, M. Hadley, P. Paulson, and F. Greitzer are with Pacific Northwest National Laboratory, P.O. Box 999, MSIN: K1-85, Richland, WA 99352, USA.

    Abstract: This paper presents the modeling approach, methodologies, and initial results of setting up a multi-layer, hierarchical information management system (IMS) for the smart grid. The IMS allows its users to analyze the data collected by multiple control and communication networks to characterize the states of the smart grid. Abnormal, corrupted, or erroneous measurement data and outliers are detected and analyzed to identify whether they are caused by random equipment failures, human error, or tampering. Data collected from different information networks are crosschecked for data integrity based on redundancy, dependency, correlation, or cross-correlations, which reveal the interdependency between data sets. A hierarchically structured reasoning mechanism is used to rank possible causes of an event to enable system operators to proactively respond or provide mitigation recommendations to remove or neutralize the threats. The model satisfactorily identifies the cause of an event and significantly reduces the need to process myriads of data.

    Index Terms: cyber security, reliability, smart grid, predictive defense, interoperability, data integrity.

    I. INTRODUCTION

    This paper presents a predictive defense information management system (IMS) for the smart grid's multi-layer, multi-protocol, multi-purpose information networks. The IMS efficiently processes myriads of data to evaluate the status of the system, identify failures, predict threats, and suggest remediation.

    As shown in Fig. 1, smart grid initiatives encourage deployment of digital technology to save energy, reduce cost, and increase reliability. For example, a phasor measurement unit (PMU) collects 30 to 60 data points per second, much faster than the 1 data point per 12 second sampling rate of the traditional supervisory control and data acquisition (SCADA) system. A smart meter collects data by the minute, while conventional metering collects data hourly or monthly. Thus, widely used digital control and communication technologies provide grid operators with an unprecedented amount of information and allow them to be aware of the status of the massive number of devices connected to the power grid, such as generators, breakers, and even appliances in commercial buildings or individual homes. A key characteristic of the smart grid is the use of real-time information from embedded sensors and automated controls to anticipate, detect, and respond to system problems, and to create the most reliable and efficient electric grid at the least cost to the economy and the least impact on the environment.

    Fig. 1: The configuration of the smart grid communication and control systems

    However, because control and monitoring signals are sent out via different networks to many end devices with various vulnerabilities, there are serious cyber security and reliability concerns about the smart grid's ability to resist attacks and heal itself without damaging infrastructure/equipment or causing large-scale blackouts [1]-[5]. In addition, the massive use of low-cost communication and electronics devices produces an explosion of information with different data formats and timestamps, with or without secured information interchange mechanisms. Thus, evolving from a "data rich" system to a "data secure" and "information rich" system that reaches the full potential of the smart grid poses significant challenges. Therefore, it is critical to fully utilize the data collection capability provided by the smart grid infrastructure to build a trustworthy system.

    This work aims to develop a dynamic information processing mechanism for the multi-layer, multi-protocol, and multi-purpose networks of smart grids, to facilitate reasoning about and predictions of failure modes and possible threats, and to identify direct and indirect responses to such failures or threats.

    The contributions of this paper are multifold. First, this paper demonstrates the design of the IMS, a comprehensive modeling platform that integrates different aspects of smart grid operations, such as the wholesale market, power grid models, the communication and control network, customer behavior, and environmental factors. While engineering modeling of distribution networks has a long history, existing modeling tools have limited capabilities and are intended for narrowly focused applications, primarily simulating the physical characteristics of the power grid (i.e., power flow models). A distinguishing feature of the IMS design is that it models the communication and control network, which is the backbone of the smart grid implementation. The benefit is that the IMS allows the interactions between information and operations to be evaluated and assessed. For example, questions concerning how communication network failures can affect distribution system operation can be answered, and the potential vulnerabilities of the smart grid architecture design can be revealed prior to field installation and implementation.

    Another contribution of this paper is the development of a predictive defense model (PDM) based on the information made available by smart grid implementation. In a smart grid environment, operators need to process a large amount of data very quickly and efficiently. Without assistance from automated tools, this places a significant burden on operators. A PDM automatically identifies the correlation and interdependence of data streams to detect abnormal situations and improve data integrity. Additionally, corrupted data and outliers, which are ignored in today's grid, can be detected and analyzed to pinpoint the error sources. A behavior tree is used to associate data with behaviors, and then link behaviors with causes. Data are also normalized based on their historical statistical characteristics to support the fuzzy threshold that triggers a behavior. Determining the causes of these data errors can help operators understand the nature of a data corruption event (i.e., whether it is caused by equipment failure, human error, or tampering). Because data and information are an integral and critical part of the smart grid, better data integrity will improve the reliability and credibility of decisions.

    The paper is organized as follows. The related work is described in Section II. The prototype of the IMS is presented in Section III. Section IV discusses the modeling framework of the PDM. The main features of the reasoner and the reasoner implementation are introduced in Section V and Section VI, respectively. Section VII gives the test results. The conclusions and future work are summarized in Section VIII.

    II. RELATED WORK

    A. System Modeling

    Existing power grid test systems [6]-[8] emphasize modeling of physical network components, such as the load, generation system, or transmission network, but have little or no capability to simulate the integration of information extracted from different data sources. Moreover, complexities and interdependencies among data sets are insufficiently represented. The general purpose of those test systems is to study power flows and voltage, angle, and transient stability using power system analysis software, to understand how these phenomena can be managed to operate the power grid reliably.

    Existing power grid data modeling systems focus on checking for data integrity and security at the equipment level. For instance, the national SCADA testbed, designed to study the security of the power grid infrastructure, offers the capability to test SCADA equipment and the control system deployed at the transmission level. However, these models do not consider the physics of the power network, such as the voltage-current dependencies.

    In this context, the existing power system modeling environment makes it very difficult to simulate smart grid operations that are closely driven by abundant data flows.

    B. Information Integrity

    Also related is work on techniques for information assurance or information integrity. Existing techniques, which follow a system-centered paradigm, fall into two categories [9]: anomaly detection and attack signature recognition. For the latter, a repository of attack signatures must be updated continuously to remain useful as system configurations, protocols, architectures, and environments change. For a particular subject of interest, anomaly detection techniques establish a profile of the subject's normal behavior, compare the observed behavior of the subject with its normal profile, and signal an attack when the observed behavior deviates significantly from that profile. Because the smart grid involves a broad range of technologies, protocols, devices, and functionalities, applying these techniques in a smart grid poses significant challenges, which need to be addressed by considering redundancy, dependency, correlation, and cross-correlations between data sets.
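    The profile-compare-signal loop of anomaly detection can be illustrated with a minimal sketch. The mean and standard deviation profile, the three-sigma rule, and the load readings below are assumptions chosen for illustration; they are not the techniques surveyed in [9].

```python
from statistics import mean, stdev

def build_profile(history):
    """Summarize normal behavior as (mean, sample standard deviation)."""
    return mean(history), stdev(history)

def is_anomalous(value, profile, k=3.0):
    """Flag a value lying more than k standard deviations from the mean."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) > k * sigma

# Hypothetical hourly load readings (kW) for one subject.
history = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.1]
profile = build_profile(history)
print(is_anomalous(1.15, profile))  # close to the profile
print(is_anomalous(5.0, profile))   # far outside the profile
```

    A fixed profile of this kind is the simplest case; the detectors described later in the paper adapt their thresholds to seasonal and weather-driven variation.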

    III. A SMART GRID INFORMATION MANAGEMENT SYSTEM CONFIGURATION

    For a smart grid, system models are built through levels of abstraction that characterize different physical aspects of the distribution network, and system complexity is captured at different levels. The IMS prototype includes different information layers: the network model, the smart meter, the advanced metering infrastructure, the SCADA network, the utility customer information systems (CISs), the environment network, and the wholesale market model.


    A. The Network Model

    The network model used in the simulation is the standard IEEE 13-node model [8], as depicted in Fig. 2.

    Fig. 2: The IEEE 13-node test feeder (nodes 650, 632, 633, 634, 645, 646, 671, 692, 675, 611, 684, 652, and 680)

    B. The Smart Meter

    A smart meter is a digital electrical meter that measures energy consumption more accurately than a conventional meter, generally with two-way communication that receives real-time prices from the utility and sends information back for monitoring and billing purposes [10]. In our base case, the smart meter data are modeled with the data collected in the Pacific Northwest GridWise Testbed project, a field demonstration of smart grid technologies that has been funded by the U.S. Department of Energy since 2004. This project monitored the energy usage of 112 households and ambient temperatures on a 15-minute basis through an automated two-way communication network for one year, from early 2006 through March 2007. Detailed descriptions can be found in [11].

    C. Advanced Metering Infrastructure (AMI)

    The wireless smart meters installed for every household are connected to the communication network. The communication network model is simplified and independent of the communication medium, protocol, and technologies. Each smart meter data set includes an outage rate and a communication error rate. The outage rate represents the likelihood of a smart meter hardware failure. The communication error rate represents the failure probability in the communication media networks. A meter outage rate is associated with each smart meter to determine at each time step whether the meter is broken. A corruption rate is associated with each information package to determine whether the package is corrupt.
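    A minimal sketch of how such rates could drive a simulation is given below. It assumes that outages and package corruption are drawn independently at every time step; the function name, the return convention, and the rate values are hypothetical and are not taken from the IMS implementation.

```python
import random

def simulate_meter_step(reading, outage_rate, corruption_rate, rng):
    """Return the reading as received by the utility for one time step.

    None            -> the meter hardware is out and nothing is sent
    ("corrupt", x)  -> the package arrived but is corrupt
    ("ok", x)       -> the package arrived intact
    """
    if rng.random() < outage_rate:
        return None
    if rng.random() < corruption_rate:
        return ("corrupt", reading)
    return ("ok", reading)

rng = random.Random(42)  # fixed seed so that a run is repeatable
received = [simulate_meter_step(1.5, 0.01, 0.05, rng) for _ in range(1000)]
outages = sum(r is None for r in received)
corrupt = sum(r is not None and r[0] == "corrupt" for r in received)
print(outages, corrupt)
```

    A persistent outage, where a broken meter stays broken until repaired, would need a state variable per meter; the sketch leaves that out.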

    D. The SCADA Network

    Electrical system measurements along the feeder, such as voltage, current, power, and energy, are collected through SCADA. The IMS also records the sequence of events. Outage management systems are integrated with SCADA systems, which can automatically report the operation of monitored circuit breakers.

    E. The Utility CIS

    Smart meter data are collected and summarized as billing information in the utility CIS. Payment information, such as billing delinquency and late payment, is included in the modeled CIS database.

    F. Environmental Data

    Ambient environment data include temperature, wind speed, and lightning strikes. In the GridWise Testbed project, temperature and wind speed were collected hourly at a local airport. These two measurement data sets are used in our base case to preserve the correlation between the meter data set and the weather data set. Lightning strikes are a modeled data set used to simulate equipment failure events caused by lightning.

    G. Wholesale Market

    In a deregulated market, multiple parties in the bulk power system engage in open-access market competition, each with its own economic objectives. The market is bid-based. The Independent System Operator (ISO) collects bids from generation companies (GenCos) and load serving entities (LSEs), from which a supply curve, S, and a demand curve, D, can be obtained. The ISO then calculates the market clearing prices, B. Fig. 3 shows how an LSE can interact with the market by installing a price-responsive controller.

    The energy price signals used in the IMS are also taken from the Pacific Northwest GridWise Testbed project to preserve the correlation among data sets.

    Fig. 3. A block diagram of a price responsive LSE

    IV. THE MODELING FRAMEWORK OF PDM

    The PDM has three major functions: detection, reasoning and predicting, and response, as shown in Fig. 4. The goal is to detect a number of threats and failure modes in major communication/control network models and to identify cross-correlations and interdependency among events across networks.

    1) Detection

    In this paper, detection is achieved by a detection-in-depth process that applies dynamic filtering, multi-layer triggering, and cross-diagnosis to the collected information to identify the nature of system events and incidents as security vulnerabilities in personnel, technology, or operation. Responses can then be taken effectively.

    The idea behind detection-in-depth is similar to the defense-in-depth approach [12]. It detects anomalies by applying several different filtering methods to information collected from different control and communication networks.

    Fig. 4: The framework of a PDM for the smart grid

    To accommodate the dynamic filtering process, we implement a number of general and specific data filters with an emphasis on dynamic bandwidth/signature settings. Each filter has a feedback loop from the reasoner. Once a problem is detected, the bandwidth and signature may change accordingly, so that more data are sent for analysis. Specific filters may also turn on to scan data at a finer level of detail.
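    The feedback loop between the reasoner and a filter can be sketched as follows. The class, its tolerance-band interpretation of "bandwidth", and the halving factor are assumptions made for illustration only; the paper does not specify how its filters are parameterized.

```python
class DynamicBandFilter:
    """Forward only readings outside a tolerance band around a nominal value."""

    def __init__(self, nominal, band):
        self.nominal = nominal
        self.band = band  # half-width of the band, in the units of the readings

    def select(self, readings):
        """Return the readings that fall outside the band and go to analysis."""
        return [r for r in readings if abs(r - self.nominal) > self.band]

    def feedback(self, problem_detected, factor=0.5):
        """Reasoner feedback: narrow the band so that more data are forwarded."""
        if problem_detected:
            self.band *= factor

voltages = [1.00, 1.02, 0.97, 1.04, 0.93]  # hypothetical per-unit readings
f = DynamicBandFilter(nominal=1.0, band=0.05)
first = f.select(voltages)          # wide band: few readings are forwarded
f.feedback(problem_detected=True)   # the reasoner reports a problem
second = f.select(voltages)         # narrower band: more readings are forwarded
print(first, second)
```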

    2) Reasoning and Predicting

    A reasoner is developed for predicting and identifying an abnormal system behavior. Inside the reasoner, incoming data are processed to infer observations, which are then used to infer indicators of behaviors. From these indicators, we can assess the risk of the threat posed by anomalous behavior. Detailed discussion on the reasoner is presented in the next section.

    At the core of the reasoner is an interdependency trigger that handles a hierarchically structured reasoning process to enable multi-layer triggering and multi-angle reasoning. The smart grid technology application raises issues such as data overflow and increasing possibilities of cyber security breaches that lead to infrastructure failures, but it also provides the data redundancy that enables cross-checking for data integrity and abnormal behaviors. Because the same device may be measured for different purposes, and the measurements are sent to operators through different communication networks, crosschecking enforces data authentication and increases the credibility and reliability of decisions.
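    One way to picture such a crosscheck is to compare a feeder-level measurement received through SCADA with the sum of the smart meter readings received through the AMI. The sketch below assumes that pairing, a relative tolerance that absorbs line losses and measurement error, and invented energy values; none of these is specified in the paper.

```python
def crosscheck(scada_feeder_kwh, meter_kwh, tolerance=0.05):
    """Compare a SCADA feeder measurement with the sum of AMI meter readings.

    Returns (consistent, relative_mismatch). The tolerance is an assumed
    allowance for line losses and measurement error.
    """
    total = sum(meter_kwh)
    if scada_feeder_kwh == 0:
        return total == 0, (0.0 if total == 0 else float("inf"))
    mismatch = abs(scada_feeder_kwh - total) / abs(scada_feeder_kwh)
    return mismatch <= tolerance, mismatch

ok, gap = crosscheck(100.0, [30.0, 40.0, 28.0])    # meters sum to 98 kWh
bad, gap2 = crosscheck(100.0, [30.0, 40.0, 10.0])  # meters sum to 80 kWh
print(ok, gap, bad, gap2)
```

    A failed check does not by itself say which network carries the bad data; that attribution is left to the reasoner.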

    3) Response

    The PDM response module features staged responses, which first return the system to a safe operating mode and then locate the problem and provide fixes. In most cases, the reasoner may not be able to identify the nature of an abnormal behavior without a few iterations of data requests and synthesis. At lower information awareness levels, passive response options, such as issuing data requests, suspending control privileges, or switching to backup systems, may be taken first. Once the possible nature of the problem has been identified, a number of active responses, such as reporting malicious attempts to authorities or sending a maintenance crew, will be suggested to the system operator.

    V. HIERARCHICAL REASONING PROCESS

    The reasoner has four unique design features that help operators detect abnormal system behavior, as described below.

    A. Multiple-layer Reasoning Process

    The reasoner is a multi-layered analysis/inference process that progresses from data to observations to indicators to behaviors. It consists of two major steps, as follows.

    1) Observations are processed from cyber and operation data to infer indicators

    Observations are based on data and reflect a cognitively meaningful state. Indicators are activities or events that make up the evidence or signature from which a behavior (or scenario) is inferred. An individual indicator signifies the presence or absence of a particular property of the modeled entity, such as a voltage higher than allowable values, electricity consumption lower than a reasonable value, or an equipment outage. The presence of an indicator is, in turn, determined by particular data values. Indicators are essentially the semantics of insider behavior and characteristics: interpretations of intentions and actions based on observations.

    2) Indicators are processed to infer behaviors

    A group of indicators is regarded as a manifestation of the behavior (or scenario). Behaviors are sequences of activities or events for achieving a specific purpose. The situation is complicated by the fact that malicious attacks and outages could have much in common. It is thus the aggregation of these activities that needs to be recognized as a manifestation of the behavior (or scenario).

    B. Hierarchically Structured Reasoning Process

    The model of the reasoner uses a hybrid approach based on pattern recognition and model-based reasoning. While identifying deviations from normal behavior is part of the analysis, so too is reasoning about conformance with prototypical behaviors that change seasonally, weekly, and daily. The challenge is to conduct model-based reasoning on the recognized patterns at a semantic level rather than applying template recognition.

    At the highest level, the model comprises a knowledge base of indicators and heuristic models of abnormal behavior. This knowledge base informs all of the components of the IMS model, and is in turn updated or modified by outputs from components that perform functions such as data collection, data fusion, and analysis. Indicators for higher levels of reasoning can be based on patterns discovered at lower levels. In addition, the reasoner can support the triggering of actions when particular patterns are recognized. Some of these actions might result in additional information being analyzed and collected for further reasoning steps.

    C. Redundancy Design

    A fundamental assumption is that not all possible data can be collected continuously, and some data may be unavailable or lost during data transfer. Some data may indicate that additional scrutiny is needed but might not warrant specific action. We therefore adopt an incremental approach to data collection, analysis, and decision making.

    The reasoner examines data sources for the presence of an indicator. Multiple detectors are used for a particular indicator, each potentially examining observables from multiple data sources. This redundancy helps ensure the detection of an indicator even if some data sources are unavailable. In addition, we can draw observables from different information layers; the desired behavior can then be modeled using indicators drawn from different data collection points, integrating data from the available information layers.
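    The effect of this redundancy can be sketched with several detectors that vote on one indicator. Taking the maximum support among the detectors that could run is an assumption made here for simplicity, and the source names, fields, and cutoffs are hypothetical.

```python
def indicator_support(detectors, sources):
    """Combine several detectors for one indicator.

    detectors: list of (source_name, function) pairs; each function maps a
               source's data to a support value in [0, 1].
    sources:   dict of available data; missing or None entries are skipped.
    Returns the strongest support among detectors that could run, or None
    when no data source was available.
    """
    supports = []
    for name, detect in detectors:
        data = sources.get(name)
        if data is not None:
            supports.append(detect(data))
    return max(supports) if supports else None

# Hypothetical detectors for a "meter outage" indicator.
detectors = [
    ("ami", lambda d: 1.0 if d["missed_reads"] >= 3 else 0.0),
    ("scada", lambda d: 1.0 if d["breaker_open"] else 0.0),
]
# The AMI data are unavailable, but the SCADA data still support the indicator.
print(indicator_support(detectors, {"ami": None, "scada": {"breaker_open": True}}))
```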

    D. Probability-based Reasoning

    The existence of a particular behavior (or scenario) is often not an all-or-none proposition. We are often interested in the degree to which the operation of the distribution grid is affected or the amount of evidence pointing to smart-meter tampering. To support this kind of understanding, detectors need to provide a graded level of support for a particular indicator. This approach yields a representation of the confidence level (an uncertainty concept more general than probability) that a concept applies to a particular entity at a point in time. The degree of belief assigned to a state varies with the observed indicators.

    VI. REASONER IMPLEMENTATION

    A. Data Flow

    Fig. 5 depicts how the data flow in the PDM. The main data processing components include the collection of observable data, the data preprocessing module, the reasoner, and the response module. Observable data include meter data, price signals, SCADA data, communication network errors, utility CIS data, and environmental data, as described in Section III. In the preprocessing step, the collected data are normalized with respect to the normal value to eliminate the scale effect for different measurements. A historical database is used to save the output of the reasoner at each time step; this result is recalled in the next time step as the behavior (or scenario) tracked previously. The predictive model provides expert knowledge of abnormal behaviors, and this heuristic model is updated upon the arrival of new data. After a possible failure mode is detected or a cyber security concern is raised, the response module is triggered to request further scrutiny or require operator intervention.


    Fig. 5: Data flow diagram

    B. The Reasoner

    Unlike traditional approaches that rely exclusively on pattern recognition and anomaly detection, the reasoner [13]-[15] uses a hybrid pattern recognition approach that assesses not only deviations from norms but also conformance with prototypical exploits and behaviors that have been identified. It is characterized by a two-step process.

    1) Detector

    First, the reasoner uses multiple detectors to examine the data sources, and each detector processes data independently. In this paper, the data sources examined are energy, power, voltage, current, communication status, weather, neighborhood energy usage, and customer delinquency information. For each detector, the measurement data are compared with the information passed from the predictive model to check for abnormality. This information is a threshold, which can take various forms. Thresholds can be data-driven or specified, and a threshold can be a value or a fuzzy set. In addition, some thresholds are values accumulated over time, for example, the number of attack attempts and the customer delinquency record.

    In some cases, the threshold is a scalar value. For example, the tripping of a circuit breaker either occurs or does not. For some measurements, however, the threshold may have two or more values. For example, voltages below 95% or above 105% of the per unit (p.u.) value are considered under-voltage or over-voltage, respectively. In addition, there is a duration requirement: only when the system voltage stays above or below these limits for longer than a duration t is it considered an over- or under-voltage event. Therefore, these thresholds are vectors.

    Another important consideration is the design of detectors for non-stationary processes. One example is household energy consumption, which varies daily, monthly, and seasonally. It is therefore no longer appropriate to use a fixed threshold throughout a year to determine whether the energy consumption of a household is too high or too low. Instead, historical energy usage patterns and present weather data can be used to create a more inclusive context that accounts for the variations in these measurements.
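    The vector threshold for voltage (a low limit, a high limit, and a duration) can be sketched as a small detector. The 0.95 and 1.05 p.u. limits come from the text; expressing the duration t as a count of consecutive samples (`min_steps`) and the sample series itself are assumptions for illustration.

```python
def voltage_events(samples_pu, low=0.95, high=1.05, min_steps=3):
    """Find over- and under-voltage events in a per-unit voltage series.

    An event is reported only when the voltage stays outside a limit for at
    least min_steps consecutive samples (the duration requirement).
    Returns a list of (kind, start_index, length) tuples.
    """
    events = []
    kind, start = None, 0
    for i, v in enumerate(samples_pu):
        current = "under" if v < low else "over" if v > high else None
        if current != kind:
            # A run just ended; keep it only if it lasted long enough.
            if kind is not None and i - start >= min_steps:
                events.append((kind, start, i - start))
            kind, start = current, i
    # Close a run that is still open at the end of the series.
    if kind is not None and len(samples_pu) - start >= min_steps:
        events.append((kind, start, len(samples_pu) - start))
    return events

series = [1.00, 0.94, 1.00, 1.06, 1.07, 1.08, 1.00, 0.93, 0.92]
print(voltage_events(series))
```

    With the default duration of three samples, the single dip at index 1 and the two-sample dip at the end are not reported, while the three-sample excursion above 1.05 p.u. is.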

    As mentioned above, we are often interested in the degree to which an indicator of abnormality is present. When this is the case, a threshold may be better described by a fuzzy set or a vector containing threshold values. In Fig. 6, the monthly energy usage of 50 households is plotted in the left panel and the derived threshold is plotted in the right panel. If a house used 70 kWh in the month, it is marked as energy-use-low, with a fuzzy membership of 0.42.

    Fig. 6: The threshold of the household monthly energy usage (left panel: energy consumption in kWh by household ID, with the regions above the high limit and below the low limit marked; right panel: energy consumption in kWh against the value of the fuzzy membership function)
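    A fuzzy threshold of this kind can be sketched as a membership function. The linear ramp and its two breakpoints below are assumptions chosen for illustration; the actual curve in Fig. 6 is derived from the household data and is not reproduced here, so the sketch does not return the 0.42 quoted above.

```python
def energy_use_low_membership(kwh, full_below, zero_above):
    """Degree (0 to 1) to which a monthly reading counts as energy-use-low.

    Readings at or below full_below get membership 1, readings at or above
    zero_above get membership 0, and the value falls linearly in between.
    """
    if kwh <= full_below:
        return 1.0
    if kwh >= zero_above:
        return 0.0
    return (zero_above - kwh) / (zero_above - full_below)

# Assumed breakpoints (kWh), chosen only for illustration.
print(energy_use_low_membership(70, full_below=50, zero_above=100))
```

    Replacing the ramp with a curve fitted to the historical usage distribution would give the data-driven threshold the paper describes.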

    2) Behaviors (or Scenario) Inference

    The second step of the reasoner is to synthesize the results from detectors to infer behavior (or scenario). Fig. 7 illustrates a single time-step in our analysis.

    Fig. 7: Synthesis of indicators to infer behavior (or scenario)

    The reasoner accepts the current plausibility measures from detectors for behaviors along with the previously determined plausibility values for behaviors. The reasoner maps the observable inputs into the vocabulary of a fuzzy finite-state automaton and the behaviors into the current support for states in the automaton. The transition function of the automaton is invoked, providing new plausibility values for the behaviors to use as input for the next time period. The reasoner continually assesses current indicators in combination with previously inferred indicators and behaviors to determine the plausibility of threat behaviors.
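    A single time step of a fuzzy finite-state automaton can be sketched as follows. The sketch assumes max-min composition, which is one common choice of transition rule; the paper does not state which rule the reasoner uses, and the states, input symbols, and weights are hypothetical.

```python
def fuzzy_step(prev, inputs, transitions):
    """One time step of a fuzzy finite-state automaton (max-min composition).

    prev:        dict state -> plausibility carried over from the last step
    inputs:      dict symbol -> support reported by the detectors
    transitions: dict (state, symbol, next_state) -> transition weight
    """
    new = {state: 0.0 for state in prev}
    for (state, symbol, nxt), weight in transitions.items():
        # A transition is only as strong as its weakest ingredient.
        strength = min(prev.get(state, 0.0), inputs.get(symbol, 0.0), weight)
        # A state keeps the strongest support among all transitions into it.
        new[nxt] = max(new.get(nxt, 0.0), strength)
    return new

# Hypothetical behaviors, indicators, and weights.
prev = {"normal": 1.0, "energy_theft": 0.2}
inputs = {"energy_use_low": 0.42, "none": 0.58}
transitions = {
    ("normal", "none", "normal"): 1.0,
    ("normal", "energy_use_low", "energy_theft"): 0.8,
    ("energy_theft", "energy_use_low", "energy_theft"): 1.0,
    ("energy_theft", "none", "normal"): 0.5,
}
current = fuzzy_step(prev, inputs, transitions)
print(current)
```

    Feeding `current` back in as `prev` at the next time step gives the recursive use of previously inferred behaviors described above.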

    VII. TEST RESULTS

    In an IMS environment, abnormalities can be simulated. An abnormality can have several possible causes, such as bad weather, equipment failure, cyber attack, and energy theft. For example, in an energy theft scenario, the smart meter reading saved in the baseline is manipulated. As a consequence, several abnormal phenomena (or data corruptions) can result:

    Energy bill is significantly lower than average Electricity consumption of neighboring households is

    higher than normal Electricity consumption pattern is irregular Energy consumption deviates from its past patterns Outage rate of the tampered smart meter is high Communication package from the tampered smart

    meter can be easily lost No energy discount issued to the customer
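Two of the symptoms listed above can be illustrated with a small simulation. This is a sketch under stated assumptions: the helper names, the 50% under-reporting factor, and all readings are hypothetical and are not the paper's test data.

```python
from statistics import mean


def tamper(readings, factor):
    """Scale recorded readings to mimic under-reporting by a tampered meter."""
    return [r * factor for r in readings]


def ratio_to_neighbors(own, neighbors):
    """Own mean consumption divided by the mean of the neighbors' mean consumption."""
    return mean(own) / mean(mean(n) for n in neighbors)


def change_from_history(current, history):
    """Relative change of the current mean against the household's past mean."""
    past = mean(history)
    return (mean(current) - past) / past


# Hypothetical monthly readings in kWh.
history = [420.0, 400.0, 380.0]                 # household's own past usage
neighbors = [[410.0, 390.0], [430.0, 370.0]]    # two neighboring households
reported = tamper([400.0, 400.0], factor=0.5)   # manipulated baseline readings

ratio = ratio_to_neighbors(reported, neighbors)   # well below 1 suggests a low bill
change = change_from_history(reported, history)   # strongly negative suggests a break in pattern
```

Each indicator alone is weak evidence, since a vacant house would produce the same numbers, which is why the reasoner combines several of them.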

    As shown in Fig. 8, one scenario (equipment failure, cyber attack, or energy theft) can trigger multiple changes in the baseline data set; conversely, a change in the baseline data set may be traced back to different causes (or scenarios). This highlights the difficulty of detecting an abnormality using a single database and clearly shows the need to synthesize multiple data sources.
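The many-to-many relationship between scenarios and data changes can be made concrete with a small lookup. The data item names echo labels in Fig. 8, but the specific assignment of items to scenarios below is a hypothetical example, not the paper's mapping.

```python
from collections import defaultdict

# Hypothetical mapping: scenario -> data items it can disturb.
SCENARIO_EFFECTS = {
    "equipment_failure": {"voltage", "feeder_outage", "meter_outage"},
    "cyber_attack": {"meter_outage", "comm_outage", "energy"},
    "energy_theft": {"energy", "billing", "meter_outage"},
}


def causes_by_symptom(effects):
    """Invert the mapping: data item -> set of scenarios that can disturb it."""
    index = defaultdict(set)
    for scenario, symptoms in effects.items():
        for symptom in symptoms:
            index[symptom].add(scenario)
    return dict(index)


def candidate_causes(observed, effects):
    """Scenarios that are consistent with every observed symptom."""
    return {s for s, symptoms in effects.items() if observed <= symptoms}


single_source = candidate_causes({"meter_outage"}, SCENARIO_EFFECTS)
two_sources = candidate_causes({"meter_outage", "energy"}, SCENARIO_EFFECTS)
three_sources = candidate_causes({"meter_outage", "energy", "billing"}, SCENARIO_EFFECTS)
```

With one data source every scenario remains a candidate; each additional source narrows the set, which is the argument for cross-checking multiple databases.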

    [Figure 8. Labels recovered from the diagram: Cyber Attacks; Equipment Failure; Energy Theft; Voltage; Energy; Feeder outage; Meter outage; Com outage; Time; Space; Adjacent areas; Payment History; Technical; Neighborhood Background his; Lighting/Wind/T(quakes/etc); Tree/Snake/Squirrel; Attack History; Billing; Environment; Cyber; Equipment Attack Possibility.]

    Fig. 8: Examples of scenario simulation

    After a scenario is set up, the corresponding data set is examined by the detectors. The reasoner uses a series of logic steps as a descriptive means of determining the interdependence between data and different scenarios and of calculating probabilities. The logic process is adapted to reflect different operating conditions (for example, seasonal variations). One example of the output given by the reasoner is shown in Fig. 9, which lists several scenarios with bars representing their probability of occurrence. In this example, the most likely cause of the tampered data received is bad weather, while the chances of other events such as cyber attack or energy theft are considerably lower.
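A ranked output of the kind shown in Fig. 9 can be sketched as a weighted combination of detector plausibilities, with the weights swapped to reflect operating conditions such as the season. The scoring rule, detector names, weights, and input values below are all assumptions for illustration; the paper does not give its probability calculation.

```python
def rank_scenarios(plausibility, weights):
    """Return (scenario, score) pairs sorted from most to least likely.

    A scenario's score is the weighted mean of the plausibilities reported by
    the detectors associated with it. Missing detectors count as 0.
    """
    scores = {}
    for scenario, detector_weights in weights.items():
        total = sum(detector_weights.values())
        if total <= 0:
            raise ValueError(f"weights for {scenario!r} must sum to a positive value")
        weighted = sum(w * plausibility.get(det, 0.0) for det, w in detector_weights.items())
        scores[scenario] = weighted / total
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical winter weighting: storm-related detectors count double.
WINTER_WEIGHTS = {
    "bad_weather": {"feeder_outage": 2.0, "storm_report": 2.0},
    "cyber_attack": {"comm_errors": 1.0, "meter_outage": 1.0},
    "energy_theft": {"low_bill": 1.0, "meter_outage": 1.0},
}

# Hypothetical detector plausibilities for one time step.
observed = {
    "feeder_outage": 0.9,
    "storm_report": 0.8,
    "meter_outage": 0.4,
    "comm_errors": 0.1,
    "low_bill": 0.0,
}

ranking = rank_scenarios(observed, WINTER_WEIGHTS)
```

Switching to a different weight table for another season changes the ranking without touching the detectors, which is one simple way to realize the adaptive logic described above.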

    [Figure 9. Bar chart of scenario probabilities on a scale up to 1.00. Scenario labels: Bad Weather, Outage, Cyber Attack, Meter Outage, Energy theft, Comm Errors, Norm Outage, Lack of Data.]

    Fig. 9: One example of output from the reasoner

    VIII. CONCLUSIONS

    The benefit of an investment in smart grid infrastructure is the availability of high-resolution data collected from more measurement points. To monetize this benefit, information that can facilitate grid operation, maintenance, and planning needs to be extracted from these data sets and made available to grid operators in an actionable manner. Therefore, in a smart grid infrastructure, there is a critical need to analyze the data collected by multiple control and communication networks 1) to characterize the states of the smart grid and 2) to improve data integrity by detecting corrupted or erroneous measurement data. To this end, we developed a multi-layer, hierarchical information management system. In this paper, we presented the design considerations, physical and information network configurations, data flow generation, detector design, and testing results of the PDM.

    The contributions of this paper are summarized as follows:

    1) This paper demonstrates the design of the IMS, a comprehensive modeling platform that integrates different aspects of smart grid operations, such as the wholesale market, power grid models, communication and control networks, customer behavior, and environmental factors.

    2) A reasoner is developed for predicting and identifying abnormal system behavior. The reasoner automatically identifies the correlation and interdependence of data streams to detect abnormal situations and improve data integrity.

    REFERENCES [1] Litos Strategic Communication, The Smart Grid: an Introduction, http://www.oe.energy.gov/1165.htm.

    [2] J. Meserve, Smart Grid may be vulnerable to hackers, http://www.cnn.com/2009/TECH/03/20/smartgrid.vulnerability/index.html.

    [3] J. Osborne, Electrical Smart Grid Not Yet Smart Enough to Block Hackers, http://www.foxnews.com/story/0,2933,511648,00.html.

    [4] PMU guidance: http://www.naspi.org/pmu.stm.

    [5] C. McKenna, Smart Grid Security Requirements Released, http://www.govtech.com/gt/626637.

    [6] APM Subcommittee. IEEE Reliability Test System, IEEE Trans. on Power Apparatus and Systems, Vol. 98, pp. 2047---2054, 1979.

    [7] IEEE Distribution Planning Working Group Report, Radial distribution test feeders, IEEE Trans. on Power Systems, Vol. 6, No. 3, pp. 975---985, 1991.

    [8] Distribution Test Feeders, Distribution Test Feeder Working Group, IEEE PES Distribution System Analysis Subcommittee. Available online: http://www.ewh.ieee.org/soc/pes/dsacom/testfeeders/index.html.

    [9] N. Ye, J. Giordano, and J. Feldman, CACSA process control approach to cyber attack detection, Communications of the ACM, Vol. 44, No. 8, pp. 7682, 2001.

    [10] Smart Meter. Available online: http://en.wikipedia.org/wiki/Smart_meter.

    [11] D. J. Hammerstrom, J. Brous, T.A. Carlon, D.P. Chassin, C. Eustis, G.R. Horst, O.M. Jrvegren, R. Kajfasz, W. Marek, P. Michie, R.L. Munson, T. Oliver, and R.G. Pratt. 2007. Pacific Northwest GridWise Testbed Projects: Part 2. Grid Friendly Appliance Project. PNNL-17079, Pacific Northwest National Laboratory, Richland, WA.

    [12] Defense-in-Depth, http://en.wikipedia.org/wiki/Defense_in_Depth_(computing).

    [13] P. Paulson, et al., A Methodology for Integrating Images and Text for Object Identification, Proc. ASPRS 2006, Prospecting for Geospatial Information Integration, 2006.

    [14] F. Baader and W. Nutt, Basic Description Logics, The Description Logic Handbook, F. Baader, et al., eds., Cambridge University Press, 2003, pp. 43-95.

    [15] T. E. Carroll, P. Paulson, F. Greizter, R. Hohimer, and Lyndsey Franklin, Reasoning About the Insider Threat, IEEE Symposium on Security and Privacy (IEEE/SP 2010), May 1619. 2010, Oakland, CA.

    Ning Lu (M'98-SM'05) received her B.S.E.E. from Harbin Institute of Technology, Harbin, China, in 1993, and her M.S. and Ph.D. degrees in electric power engineering from Rensselaer Polytechnic Institute, Troy, NY, in 1999 and 2002, respectively. Her research interests include modeling and analysis of power system load behaviors, energy storage evaluation, renewable integration, climate impact on power grids, and smart grid modeling and diagnosis. Currently, she is a senior research engineer with the Energy Science & Technology Division, Pacific Northwest National Laboratory, Richland, WA. She was with Shenyang Electric Power Survey and Design Institute from 1993 to 1998.

    Pengwei Du (M'06) received B.Sc. and M.Sc. degrees in electrical engineering from Southeast University, Nanjing, China, in 1997 and 2000, respectively, and his Ph.D. in electrical engineering from Rensselaer Polytechnic Institute, Troy, NY, in 2006. He has worked at Pacific Northwest National Laboratory, Richland, WA, since 2008. His research interests include distributed generation, power system modeling and analysis, and energy storage applications. He has published over 10 journal or conference papers and holds 2 patents. He is a member of the WECC modeling and validation working group.

    Patrick Paulson is a Senior Research Scientist in the Knowledge Sciences group. Patrick received his Ph.D. from North Dakota State University in 2001, where his research interests were reinforcement learning and case-based reasoning. His current research interests include knowledge representation, semantic computing, provenance representation, uncertainty, and fuzzy logic. While at PNNL, he has developed models of socio-cultural factors and insider threats and managed research projects in semantic similarity, semantic fusion of image and text documents, and applying risk modeling to cyber security.

    Frank L. Greitzer received his B.S. in mathematics from Harvey Mudd College, Claremont, CA, in 1968 and his Ph.D. degree in mathematical psychology from the University of California, Los Angeles, in 1975. His research interests include human decision making/information processing and human information interaction concepts for enhancing operator decision making in complex environments such as intelligence analysis and power grid operations. His research interests also include evaluation methods and metrics for assessing the effectiveness of decision aids, analysis methods, and displays. Dr. Greitzer is a chief scientist for cognitive informatics in the National Security Directorate, Pacific Northwest National Laboratory, Richland, WA.

    Xinxin Guo received a B.S. in electronic engineering from Dalian Maritime University in 1997 and M.S. degrees in engineering from the University of Wisconsin at Madison in 2007. He is currently a research scientist at Pacific Northwest National Laboratory in Richland, WA. His research interests are in power system dynamics and stability, demand response, energy storage, and smart grid.