Distributed Compositional Operations for Aggregation and Visualization of Cyberspace Data

A Proposal submitted to the

Air Force Research Laboratory

AFRL/RH Human Effectiveness Directorate

Det 1 AFRL/PKHA, Bldg 167, 2310 Eighth Street

Wright-Patterson AFB, OH, 45433-7801

BAA-08-02-RH (AFRL/RH FY08 HBCU/MI Set Aside Program)

Contracting POC: Rhonda L. Powderly, (937)656-9046

(Technical Area No. RHC-1: Synthesizing the Dynamics of Cyberspace: Visual Renderings for Distributive Systems to Characterize Cyber Attack, Performance, and Vulnerability)

Amount: $99,887 Duration: 14 months Start date: August 2008

Principal Investigator

Prof. Kaliappa Ravindran
Department of Computer Science
The City College of CUNY
Convent Avenue
New York, NY 10031
Phone number: (212)650-6218
Fax number: (212)650-6248
Email address: [email protected]

Authorized Institutional Representative

Regina Masterson, Director
Office of Research Administration
The City College of CUNY
Phone number: (212)650-5418
Fax number: (212)650-7906
Email address: [email protected]


PI: K. Ravindran, City University of New York

Contact information: Email: [email protected]; Ph.: 212-650-6218.

TASK SUMMARY

The proposed project deals with designing composition operators for data aggregation and visualization in a geographically distributed information network. The composition functions are prescribed by the application to act upon the data collected from the external environment. The component data may pertain to a certain geographic area being observed and/or disparate sub-systems being controlled. The fusion of a set of such data components for dissemination purposes involves applying logical operations on these data to yield a high level representation of the external phenomenon being studied. An example is the assessment of terrorist threat level in a certain geographic area based on the monitored communications, the social demographics, the time-frame in question, and the presence of high-profile targets. For such complex application domains, the existing models of using simple syntactic rules for data aggregation are not sufficient. Instead, multiple interpretations of the low level data should be composable for analysis purposes. A visual representation of the composed data will also help human commanders carry out more complex compositions of the data. In this project we propose semantics-based operators that can be installed at aggregation nodes in order to customize an application-specific interpretation of the raw data. Such data aggregation nodes are part of an overlay tree set up over the distributed information network, working as application proxies to drive the necessary operations on data collected from the external environment. The semantics-aided data aggregation can improve the situational awareness of applications (be it in military, commercial, or industrial settings), in comparison to the currently prevalent syntactic rule-based models of processing data.

We shall develop the proxy-based overlay architecture for data aggregation, with a set of relational operators that can be applied on the target data items. Both the spatial and temporal characteristics of data can be incorporated in the fusion operators that are installed at the proxy nodes of the aggregation tree. The formal properties of the fusion operators will be identified, and a prototype implementation of the fusion system will be undertaken. GUIs will be developed that allow the human users to install customized functions and operators in the proxy nodes and visualize the data in different ways. Case studies of applications will be undertaken for analyzing denial of service attacks on distributed networks and for application-specific QoS control with dashboard-like visual interfaces, to demonstrate the benefits of our semantics-based approach.


TECHNICAL APPROACH

We begin with a generalized structure of a geographically distributed system maintaining different data components that are related. It provides the basis for our proposed overlay-tree-based data aggregation using application-supplied fusion operators.

1: Behavioral view of distributed systems

The distributed structure of a system consists of multiple computation nodes (or sub-systems) that collaborate with one another to carry out an application task. The task execution state is determined by the spontaneous changes occurring in the external environment and the service-level requirements prescribed by application entities. The meta-data pertaining to the external environment and the service-level needs of applications are made available to the control task through a set of sensors and actuators. Human elements and/or intelligent agents at distinct geographic locations are part of the distributed control system (DCS), interacting with one or more nodes to obtain strategic decision support. In a digital battlefield for instance, the environment may consist of infrared sensors to collect data about enemy troop and machinery movement in a certain terrain. The computation subsystems are interconnected by a geographically distributed information network that makes the relevant pieces of data available to these subsystems at various points in time, to enable them to carry out the application task [1].

1.1: Modular structuring of a DCS

The behavior of a DCS as a whole may be continually changing, sometimes even to below acceptable threshold levels of operation. This may be due to sub-system level components (viz., computation nodes and/or their interconnections) being forced to operate in a degraded mode because of undesirable changes occurring in the external environment (say, denial of service attacks on a network) and/or component failures. Visible changes in a system behavior mean that the behavior is measurable in terms of concrete parameters of the system. See Figure 1.

Consider, as an example, sensor devices deployed in the field of operations, say, to detect enemy movements. These sensors are prone to failures (such as the loss of battery power to sensors). At a macro level, the reliability of inferences made about enemy movements, which depends on the accuracy and timeliness of sensor data, is a measurable property of the detection system. Some form of majority voting technique (such as requiring that at least 10 sensors in the field have detected enemy movements) may be employed to increase the quality of inferences. The decision-making by commanders, say, about attacks on enemy positions, may itself be based on such a quantitative assessment of the inference reliability (or confidence level) parameter (footnote 1).

Consider the case of a public health surveillance system. When the incidence rate of asthma in a certain geographic area exceeds a threshold (say, as observed from the number of patient visits to clinics), the system may dispatch additional medical resources to the area (such as ambulances, medicine supplies, and nursing staff). The system may also initiate actions that have large time-scale effects, such as reducing the aerosol concentrations in the atmosphere. A combination of such corrective actions may be put in effect until the asthma incidence rate falls below the threshold.

Footnote 1: The sensors constitute the resources that improve the quality of inference in the battlefield application.


Figure 1: Structure of a geographically distributed information network

Since an affected subsystem impacts the visible behavior of a DCS in some way, we need to instrument mechanisms for detecting deviations from the expected behaviors. To enable construction of these mechanisms, a DCS should be modularized in such a way that the overall behavior of interest can be inferred from the easily controllable and/or observable behavior of system components in various geographic areas. The high-level inference involves aggregating the data collected through the various components.

1.2: Overlay trees

Our aggregation architecture is based on a tree-structured overlay set up over an information network, in which the root node is attached to a data dissemination station and the leaf nodes are attached to the data collection modules in different geographic regions. An intermediate node in the tree aggregates the data emanating from its downstream nodes, and this partially aggregated data propagates up the tree towards the root. See Figure 2.

Suppose, in the earlier example of sensors, it is stipulated that the confidence level in the reliability of enemy movement information in a certain area exceed a certain threshold, say 70%, in order to initiate a pre-emptive strike on enemy positions. Here the overlay node carries out a majority voting on the sensor data collected at the leaf nodes. When the number of active sensors falls below the majority figure required of sensors detecting enemy movement (say, 10), the detection system will no longer be able to supply the information at the critical level of reliability. The system should then be able to notify a failure to the C2 center, so that the latter can resort to alternate means of inferring enemy movements.
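The voting behavior described above can be sketched as follows. This is a minimal illustration, not the proposal's implementation: the `SensorReport` type, the `vote_on_movement` function, and the use of `None` to signal a failure notification are all assumptions made for the example; only the quorum figure of 10 comes from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SensorReport:
    """One leaf-level report (hypothetical structure for illustration)."""
    sensor_id: str
    active: bool            # False if the sensor has failed (e.g., battery loss)
    detected: bool = False  # True if the sensor reports enemy movement


def vote_on_movement(reports: List[SensorReport], quorum: int = 10) -> Optional[bool]:
    """Vote at an overlay node on whether enemy movement was detected.

    Returns True or False when enough sensors are active to reach the
    quorum, and None when they are not, in which case the caller should
    notify a failure to the C2 center.
    """
    active = [r for r in reports if r.active]
    if len(active) < quorum:
        return None  # quorum unreachable: reliability cannot be guaranteed
    detections = sum(1 for r in active if r.detected)
    return detections >= quorum
```

Returning a distinct "cannot decide" value keeps the failure case separate from a genuine negative vote, which is the distinction the C2 notification relies on.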


Figure 2: Tree-structured overlay for semantics-driven data fusion

The criteria used for data aggregation need to be embodied into the functions placed in the overlay nodes. These functions act as proxies for the application, in that a projection of the data processing needs in a certain geographic area is encapsulated into the overlay function serving that area.

1.3: User-level control of data aggregation

A high level inference of events in the cyberspace, say, for situational awareness, may involve examining many data components, and require capturing the causal and timing relationships between them. Consider an example of assessing the health and fatigue of soldiers deployed in a geographic region. The influential parameters are, say, the climatic conditions, social demographics (of the local population), distance from the nearest base, and the medical facilities. Their relationship may often not be expressible in a closed form, due to the non-separable ways in which one parameter may impact the others. In general, only imprecise information may be available in the system, compounded by the lack of quantitative models of the impact of various parameters.

Often, the occurrence of an event e may influence the prescription of which future events are possible in the system state in which e occurred. The large problem dimensionalities and incompleteness in system-level characterizations often mean that the required event prescriptions are not known beforehand at the start of a data aggregation procedure. In the absence of comprehensive analytical models to capture the relations between the various data components, heuristics-based models offer effective means of relating them over specific operating regions of the system. This in turn calls for an active participation of human users during the event inferencing over a distributed information network.

User participation requires an ability to reconfigure event prescriptions as the partially aggregated data propagates upstream along the overlay tree. In an example of target tracking over a terrain, the system may initially look for certain simple patterns in the terrain images. When these patterns appear in the images, there may be a need to examine other image-level attributes of the terrain, such as the patterns induced by, say, dense clouds over the terrain, to delineate the suspicious objects from the aberrations caused by other environmental parameters. Thus, we need user-level interface tools for prescribing data aggregations, say, with visual representations of the meta-relationships between data elements.

2: Relationship to current models of data aggregation

There are proposals studied elsewhere for in-network data aggregation, particularly in the context of sensor networks, such as TinyDB [5] and Cougar [4]. The goal of in-network data aggregation is to reduce the energy costs that are otherwise incurred when raw data are sent to the disseminating nodes. These works are mainly for homogeneous data types, such as all the sensors observing the environment temperature in a given region. In contrast, our proposal is for heterogeneous data types, with the rules of aggregation determined by the semantic relationships between various data.

Directed diffusion techniques [6] employ an attribute-based naming of data, which provides a basis for specifying the relationships between them. But the goal of this work is more on the network protocols and algorithms for in-network aggregation, and less on exploiting the data semantics in the aggregation operations. The SensorML language developed elsewhere [7] allows defining the external characteristics of a physical sensor device using XML-based schemas. This language is specifically designed for satellite-based sensor data collections; its use for defining high level data semantics has not been explored.

Virtual Sensors [14] is an abstraction that allows heterogeneous physical sensors to be viewed uniformly through a canonical interface. It provides a publish-subscribe interface to the sensor data, for applications to extract the aggregated data they need. This work however focuses more on network-oriented data and parameters, with the main goal being to reduce the energy of sensor devices operating in wireless mode.

In comparison with the above works, our proposal differs in two significant areas. First, it offers a framework to fuse disparate data items based on the semantic relationship between them. Second, our notion of sensors is far more abstract, in that any data collection system in the external environment of an application is treated as a (logical) sensor. For instance, a question-answer session with a soldier in the field can be abstracted as a sensor that collects data about the mental health and fatigue of the soldier. Thus, our generalized notion of sensors and data fusion goes beyond the existing notions of physical sensor devices. The meta-model of sensors in our approach can thus be instantiated in diverse application domains (with domain-specific stub interfaces, of course), thereby reducing the system development effort and costs through software re-use.


An aggregation of events at various nodes in the tree typically enlarges the time-scale of changes in the resulting macro-level data. An example is to determine if there is a sustained lack of morale in the deployed soldier battalion, based on the observations of spatially separated smaller groups. The overall loss of morale is then a maximum of the per-group observations. Since the max operator selects the highest of a set of rapidly fluctuating per-group morale levels at any given time, the composite metric varies slowly in comparison. Consider another example, namely, vehicular networks. Here, the traffic congestion on a given route is the maximum of the reported congestion levels in the various stretches of roads along that route.
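As a small illustration of max-based aggregation up an overlay tree, the sketch below uses a hypothetical `OverlayNode` class with an application-supplied operator that defaults to `max`. The class, its fields, and the sample congestion values are assumptions for the example and are not taken from the proposal.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class OverlayNode:
    """A node of the aggregation tree (hypothetical structure)."""
    name: str
    value: Optional[float] = None  # reading, set only at leaf nodes
    children: List["OverlayNode"] = field(default_factory=list)

    def aggregate(self, op: Callable[[Iterable[float]], float] = max) -> float:
        """Combine the readings below this node with the supplied operator."""
        if not self.children:
            if self.value is None:
                raise ValueError(f"leaf {self.name} has no reading")
            return self.value
        # Each intermediate node aggregates the partial results of its children.
        return op([child.aggregate(op) for child in self.children])


# Congestion levels reported for three stretches of one route.
route = OverlayNode("route", children=[
    OverlayNode("stretch_a", 0.2),
    OverlayNode("junction", children=[
        OverlayNode("stretch_b", 0.7),
        OverlayNode("stretch_c", 0.4),
    ]),
])
route_congestion = route.aggregate()  # maximum over all stretches
```

Passing the operator as a parameter keeps the tree traversal separate from the aggregation rule, so the same structure could host other application-supplied operators.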

A domain-specific interpretation of the events in different regions cannot be adequately captured with the standard mathematical operators of aggregation, as argued in [14]. For example, the effect of a morale loss in one group of soldiers on the overall battle-readiness of the battalion spanning the adjoining regions cannot be expressed through simple syntactic connectives. Instead, the role of the group in the overall battle plan needs to be taken into account. This motivates the need for semantic knowledge in interpreting events.

3.2 Aggregation using semantic knowledge

Information network applications often require abstracted measurements of the diverse environment phenomena (or events) in various geographic regions. These measurements need to be interpreted using a semantic relationship between the events (which may take into account the weak consistency and the temporal correlation among events [13]). Typically, the confidence level in the reporting of a combined event can be increased with semantic knowledge that interconnects the two independently reported events.

As an example, consider the detection of a plane (in terms of speed and location) by the devices in region 1 followed by the detection of a plane by the devices in an adjacent region 2 after a certain time interval T. If the geographic distance between regions 1 and 2 corresponds to a flight time close to T at the given speed, then it is highly likely that the objects detected in regions 1 and 2 are the same plane. So, when the detection reports from regions 1 and 2 arrive at the overlay node Z, the latter may aggregate them into a single report with a confidence measure higher than $\max(\{\gamma_1, \gamma_2\})$, where $\gamma_1$ and $\gamma_2$ are the confidence measures of the two individual reports. The timing correlation in the two reports increases the confidence level of the combined report to higher than that of the individual reports.

Where semantic knowledge is used, the aggregation operations on two events may have to be carried out in a certain order (i.e., the operations may not satisfy the commutativity and/or associativity properties). Typically, each overlay node may implement the required synchronization between the arrival of various data items from its downstream nodes, based on the ordering relationship between the data items, such as the causal relationship between events.
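The plane example can be sketched as an order-sensitive fusion function. Everything named here is hypothetical: the `Detection` type, the `fuse_detections` function, the 30-second tolerance, and in particular the combination rule, which is assumed to be a noisy-OR of the two confidences because the text only requires that the result exceed the larger of the two.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    """A plane detection report from one region (hypothetical structure)."""
    region: str
    time_s: float      # detection time, in seconds
    speed_mps: float   # reported speed, in metres per second
    confidence: float  # confidence of this report, in [0, 1]


def fuse_detections(first: Detection, second: Detection,
                    distance_m: float, tolerance_s: float = 30.0) -> Optional[Detection]:
    """Merge two reports if their time gap matches the expected flight time.

    The operation is order-sensitive: `first` must precede `second`.
    Returns None when the reports cannot be attributed to the same plane.
    """
    gap = second.time_s - first.time_s
    if gap <= 0 or first.speed_mps <= 0:
        return None
    expected = distance_m / first.speed_mps
    if abs(gap - expected) > tolerance_s:
        return None
    # Assumed rule (noisy-OR): exceeds both inputs whenever each lies in (0, 1).
    combined = 1.0 - (1.0 - first.confidence) * (1.0 - second.confidence)
    return Detection(f"{first.region}+{second.region}",
                     second.time_s, second.speed_mps, combined)
```

Swapping the two arguments yields no fused report, which illustrates why such operators need not be commutative.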

3.3 Replica voting based on data classification

A computer algorithm or system sampling an environment parameter is less than 100% certain about the accuracy of the indicator, due to the large dimensionality of input data (e.g., a question-answer session with soldiers to reason about their morale). To increase the confidence level in the outcomes in a quantifiable way, we resort to replica voting among the various computational modules. Here, voters are the computational evaluation algorithms that are replicated to observe the same external parameter (or object).


Figure 4: Functional view of mapping feature spaces to objects

We denote the data classifier implemented by a voter algorithm $V_i$ as $M_i(F_i, L_i)|_{i=1,2,\ldots,N}$, where $M_i$ is an algorithmic procedure operating on a feature set $F_i$ to describe the parameter values of a class $O$. In the earlier example to assess the morale of soldiers, a question-answer session may pose 50 questions and examine their answers (so N = 50). The outcome may be the determination of morale as one of 4 levels: EXCELLENT, HIGH, MEDIUM, and LOW, which constitute $O$. Typically, the computational procedure $M_i$ may employ some form of Markov modeling for object classification [10].

For notational purposes, a canonical structure of $M_i$ may take the functional form:

$$M_i : F_i \times L_i \rightarrow O,$$

where $L_i$ is a set of logical formulas applied on the feature values instantiated for $F_i$. See Figure 4 for illustration. $M_i$ often employs statistical pattern recognition techniques to obtain a functional view of the data space of $F_i$ sensed by $V_i$ [12], as captured through the logical formulas $L_i$.

Consider an example of detecting the incidence of a disease outbreak, say, Malaria, in a geographic area where troops are deployed. Here, $F_i$ may depict the input data features that describe the conduciveness to mosquito breeding, such as water stagnancy and land marshiness, atmospheric temperature, and altitude of the terrain, whereas $O$ depicts the various types of Malaria outbreaks and their intensity levels. The algorithmic procedure $M_i$ may use, say, the water stagnancy and marshiness as the primary features to first determine the possibility of Malaria outbreaks, such as a rule clause stagnancy > 6 days; thereafter, the atmospheric temperature and the population demographics may be used to distinguish different types of Malaria. $L_i$ is a set of such clauses. These clauses can be represented as nodes of an aggregation tree.
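A classifier built from an ordered set of rule clauses might look like the sketch below. Only the clause "stagnancy > 6 days" comes from the text; the feature names, the 25-degree temperature threshold, and the class labels are placeholders invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]
# A clause pairs a condition on the features with the class label it implies.
Clause = Tuple[Callable[[Features], bool], str]

# Hypothetical clause set; clauses are tried in order and the first match wins.
CLAUSES: List[Clause] = [
    # Primary feature: without prolonged stagnancy, no outbreak is inferred.
    (lambda f: f["stagnancy_days"] <= 6, "NO_OUTBREAK"),
    # Secondary feature: temperature separates two placeholder outbreak types.
    (lambda f: f["temperature_c"] >= 25.0, "OUTBREAK_TYPE_A"),
    (lambda f: True, "OUTBREAK_TYPE_B"),
]


def classify(features: Features, clauses: List[Clause] = CLAUSES) -> str:
    """Apply the clauses in order and return the label of the first that holds."""
    for condition, label in clauses:
        if condition(features):
            return label
    raise ValueError("no clause matched the given features")
```

Keeping the clauses as data rather than hard-coded branches is one way a user could install or replace them at an aggregation node.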

3.4 Confidence levels associated with data

Suppose $o = M(F, L)$ is a data classifier that detects an object $o \in O$ with 100% certainty. Here, $F$ is an exhaustive enumeration of features in terms of which $O$ can be completely described using an appropriate set of logical formulas $L$. Relative to this ideal case, a data classification algorithm $o_i = M_i(F_i, L_i)$ has $|F_i| \le |F|$ and $|L_i| \le |L|$. Though the non-represented features $F - F_i$ are deemed less important by $M_i$, they do have some impact on the ability of $M_i$ to distinguish one object from another in certain value regions of the missing feature space.

We then say that the closeness of $o_i$ to $o$ is a measure of the accuracy of the device $V_i$. The uncertainty in the detection of an object by $V_i$ is captured by a confusion probability parameter $p_i$, i.e., the probability that an object $o_i$ reported by $V_i$ indeed matches the actual physical world phenomenon $o$ within a prescribed accuracy level, where $0.5 \le p_i < 1.0$.

Voting among such replica devices provides an overall confidence level $\gamma$ that is higher than the per-device confidence level in the system, i.e., $\max(\{p_i\}_{i=1,2,\ldots,N}) < \gamma < 1.0$. This algorithmic requirement is expressed by the mathematical relation:

$$\left(1 - \binom{N-1}{K_1} \, [1 - p_i(X_j)]^{K_1 + 1 - K_2}\right) > \gamma, \qquad (1)$$

where $K_1$ and $K_2$ are the number of consenting and dissenting votes on a datum $X_j \in O$ proposed by $V_i$, assuming that devices have the same capability for object classification. For example, $p_i = 0.85$ and $N = 10$ can achieve a confidence level of 98% with replica voting (footnote 3).
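To give a feel for how voting raises confidence, the sketch below computes the probability that a strict majority of independent voters is correct. This is the textbook independent-voter calculation, swapped in for illustration; it is not the relation in Equation (1), and the function name and the independence assumption are ours.

```python
from math import comb


def majority_confidence(n_voters: int, p: float) -> float:
    """Probability that a strict majority of n independent voters is correct.

    Assumes every voter is correct with the same probability p and that
    voters err independently of one another.
    """
    if n_voters < 1 or not 0.0 <= p <= 1.0:
        raise ValueError("need n_voters >= 1 and 0 <= p <= 1")
    needed = n_voters // 2 + 1  # smallest strict majority
    return sum(
        comb(n_voters, k) * p**k * (1.0 - p) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )
```

With a single voter the function returns p itself; with p above 0.5, adding voters pushes the result towards 1, which is the qualitative effect the text describes.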

The users who disseminate the data are part of the voting functionalities. The computational aspects of the voting can be incorporated as part of a GUI in the data aggregation software tools.

The enforcement of timeliness constraints requires knowledge of the overlay tree topology and the data delays incurred in the various fusion path segments.

4: Specification of semantic knowledge in middleware

4.1 Temporal and spatial dimension of information elements

An information element $I_i|_{i=1,2,\ldots}$ may assume a value $val_i(t, s)$ at time $t$ in a spatial location $s$. Certain changes in $val_i$ may indicate a significant deviation in the external environment relative to its current operating point, which may be deemed an event [2].

We may prescribe a time scale and a spatial scale $(\tau_i, \sigma_i)$ over which the changes in $I_i$ occur, say, $\tau_i$ is an average time interval between the changes and $\sigma_i$ is an average area of the region affected by the changes. For example, a noticeable change in aircraft coordinates provided by a tracking radar may not occur any more frequently than once in 5 sec. The $\tau$ and $\sigma$ parameters may determine the sampling parameters for the sensor data emanating from the external environment.

Suppose an event pertaining to $I_i$ is observed to occur at time $t$ and at a location $s$. A data fusion procedure starts at time $t' > t$ in a region $s'$ that is close to $s$. The fusion procedure should complete within a time duration $\delta > 0$, where $(t' + \delta) < (t + \tau_i)$. In other words, the fusion procedure should complete before the next change in $I_i$ occurs on the average (otherwise, changes occur faster than they can be handled). The $\tau$ parameter depicts the time scale of event occurrences, whereas the $\sigma$ parameter depicts the geographic area of impact of events.
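The timeliness condition, namely that fusion must finish before the next change is expected, reduces to a one-line check. The function and parameter names below are hypothetical; the 5-second change interval in the usage lines echoes the tracking-radar example.

```python
def fusion_meets_deadline(event_time: float, start_time: float,
                          fusion_duration: float, change_interval: float) -> bool:
    """Check that fusion completes before the next expected change.

    event_time      -- time at which the event was observed
    start_time      -- time at which fusion starts (must be after event_time)
    fusion_duration -- time the fusion procedure takes (must be positive)
    change_interval -- average time between changes of the information element
    """
    if start_time <= event_time or fusion_duration <= 0:
        raise ValueError("fusion must start after the event and take positive time")
    return start_time + fusion_duration < event_time + change_interval


# Radar example: changes occur no more often than once in 5 seconds.
on_time = fusion_meets_deadline(0.0, 1.0, 3.0, 5.0)   # finishes at 4.0 s
too_slow = fusion_meets_deadline(0.0, 1.0, 4.5, 5.0)  # finishes at 5.5 s
```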

Footnote 3: Replica voting on fuzzy data may be viewed as a knowledge-based data aggregation procedure executed at a leaf node. Here, the goal is to generate a single event notification with a base confidence measure that is higher than $p_i$ (cf. Equation (1)). The semantic knowledge is that when two devices report the same datum with confidence levels of $p_{i1}$ and $p_{i2}$, the leaf node can accept the datum with a confidence level higher than $\min(\{p_{i1}, p_{i2}\})$.


4.2 Event predicates

In general, an event may be represented as a condition on the data components $O$ collected from the external environment. An event is said to occur when a user-prescribed condition on the data components holds, i.e., a predicate $L(O) = \text{true}$, where $L(\,)$ is a logical formula applied on the data space $O$.

L is an applicative function that maps the observed value of data to a boolean result (such as>, T2)).

Suppose the missile system which was readied for firing needs to be pulled back, to avoid a firing, when $p$ moves to a distance beyond a threshold $T_1'$ ($> T_1$), which implies that $\frac{d\{loc(p)\}}{dt} > 0$ (footnote 4). This relation may be expressed as:

$$\text{evnt\_spec}(\text{NOFIRE}) \equiv \left( (\text{avg}(loc(p)) > T_1') \wedge \left( \frac{d\{loc(p)\}}{dt} > 0 \right) \right).$$

The radar device that samples $loc(p)$ is embedded into the snapshot-taking mechanism, which filters the timed samples from the radar.
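The NOFIRE condition can be evaluated over a window of timed radar samples roughly as follows. This is a sketch under assumptions: the `nofire` function, the sample format, and the finite-difference estimate of the rate of change are illustrative choices, not the proposal's implementation.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (time in seconds, distance of the target)


def nofire(samples: List[Sample], pullback_threshold: float) -> bool:
    """Evaluate the NOFIRE predicate on a window of timed distance samples.

    True when the average distance exceeds the pull-back threshold and the
    distance is increasing over the window.
    """
    if len(samples) < 2:
        return False  # a rate of change needs at least two samples
    avg_distance = sum(d for _, d in samples) / len(samples)
    (t0, d0), (t1, d1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be ordered by increasing time")
    rate = (d1 - d0) / (t1 - t0)  # finite-difference estimate of the derivative
    return avg_distance > pullback_threshold and rate > 0
```

Averaging over the window plays the filtering role attributed to the snapshot-taking mechanism, so that a single noisy sample does not flip the predicate.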

Consider the earlier example of determining the inference reliability parameter for a bank of sensors that detect enemy movements in a battle terrain. A property of this surveillance function may be prescribed as:

$$\text{evnt\_spec}(\text{SENREL}) \equiv \left( (N_s > 10) \wedge \left( \frac{N_g}{N_s} > 0.7 \right) \right),$$

where $N_s$ is the total number of sensors in the bank and $N_g$ is the number of non-faulty sensors.

The underlying data fusion procedures determine $N_g$ from a knowledge of the set of sensors in the bank (such as majority voting among sensors), which allows the evaluation of predicates.

Footnote 4: That $T_1' > T_1$ implies a hysteresis during increasing trends of $loc(p)$, to avoid chattering of the missile system.

With the above form of predicates loaded into the data fusion GUI, the data fusion procedure computes the applicative functions avg( ), d( )/dt, and

with one another (unlike benign failures which are statistically independent of one another) [17], the temporal and/or spatial relationships between the various observed symptoms can be easily gleaned using a tree-like visual representation of the relationships. For instance, the loss of a communication link in the network can be correlated with a reduction in the data arrival rate at an end-point (such as the events $e_y$ and $e_x$ in Figure 1). Here, the link will appear as a lower node in the tree and the data rate will appear as an upper node. All the causal events that can impact the data arrival rate (including the link availability) will then be represented as lower nodes connected to the upper node.

The case study will involve using a Cisco router based network testbed with Spirent traffic analyzers that are available at CUNY. The traffic analyzers can inject arbitrary amounts of data at selected points in the network to simulate different types of DoS attacks. Delay and bandwidth sensors will be implanted at the observation points. Where necessary, multiple sensor algorithms will be installed (say, different bandwidth estimation methods) to enhance the confidence levels in the event reporting by replica voting.

The goal of the case study will be to assess the effectiveness of visual GUI aided data aggregation methods in accurately detecting DoS attacks.

5.3: A case study of dashboard-like visual interface for QoS control

Typically, QoS control is driven by applications based on how important the data being analyzed is and the situation on hand. In a non-combat scenario for instance, a coarse interpretation suffices for a moving object detected in a terrain. In a combat scenario, by contrast, the moving object needs to be closely examined in order to detect, say, enemy planes. The latter involves more complex algorithmic procedures for target recognition purposes and hence incurs more computational and communication resources when compared to a casual examination of the objects (a study of AWACS from an adaptive QoS standpoint is given in [16]). Thus, the application requirements directly control the amount of resources expended for detecting the objects in a terrain. Since the resources are scarce in a battlefield setting, conserving the resources with an appropriate high-level QoS interface is important.

A dashboard is a high-level visual interface that provides the human users with a simple means of controlling the QoS and hence the computational and communication resources [18]. See Figure 5. The dashboard provides sliding bars that depict the normalized ranges of QoS metrics for the application. It also provides a resource indicator that represents the available resources in a macroscopic way (footnote 5). The QoS metric bars capture two coarse parameters: timeliness and accuracy of the object detection. These metrics are not orthogonal from a resource allocation standpoint, because improving the timeliness of a report will involve using simpler and less expensive computational algorithms, albeit with a reduction in the accuracy of the report. The user can slide the bars appropriately to control the timeliness and accuracy parameters.

Footnote 5: An analogy is the fuel indicator in an automobile dashboard that allows the drivers to decide how far they can travel.

Figure 5: A schematic of dashboard-like visual interface for QoS control

In the underlying implementation, the timeliness and accuracy parameters will be mapped onto domain-specific stub procedures to control the computational and communication resources. For instance, sliding the accuracy bar to the higher end will invoke more complex algorithms (with a large $F_i$ and $L_i$); it may also possibly use multiple algorithms to enhance the accuracy by replica voting. Accordingly, the system will update the timeliness indicator on the dashboard, so that the user can gauge the delays involved in the reporting of objects. Network monitors will sample the various resource information (such as bandwidth, links, processing speeds, and battery power) and aggregate them to update the resource indicator bar.
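One way to picture the mapping from the accuracy bar to a stub procedure is the sketch below. The algorithm catalogue, its accuracy and delay figures, and the selection rule (cheapest algorithm that meets the requested accuracy) are all invented for illustration; the proposal does not specify them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DetectionAlgorithm:
    """A candidate detection procedure (hypothetical catalogue entry)."""
    name: str
    accuracy: float  # normalized accuracy, in [0, 1]
    delay_s: float   # estimated reporting delay, in seconds


# Placeholder catalogue; the figures are illustrative, not measured.
CATALOGUE: List[DetectionAlgorithm] = [
    DetectionAlgorithm("coarse_motion", 0.60, 0.5),
    DetectionAlgorithm("shape_match", 0.80, 2.0),
    DetectionAlgorithm("full_recognition", 0.95, 8.0),
]


def select_algorithm(accuracy_slider: float,
                     catalogue: List[DetectionAlgorithm] = CATALOGUE) -> DetectionAlgorithm:
    """Pick the fastest algorithm that meets the requested accuracy.

    Falls back to the most accurate algorithm when none reaches the request.
    The delay of the returned algorithm can drive the timeliness indicator.
    """
    if not 0.0 <= accuracy_slider <= 1.0:
        raise ValueError("slider position must lie in [0, 1]")
    for algo in sorted(catalogue, key=lambda a: a.delay_s):
        if algo.accuracy >= accuracy_slider:
            return algo
    return max(catalogue, key=lambda a: a.accuracy)
```

Moving the slider from 0.5 to 0.9 in this sketch switches from the cheapest procedure to the most expensive one, so the reported delay rises along with the accuracy, which is the trade-off the dashboard is meant to expose.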

We shall use the network testbed at CUNY to realize the dashboard interface. Standard image processing algorithms [?] will be constructed in a layered fashion, and then applied on (publicly available) target data to establish the feasibility of dashboard-like QoS control.

5.4: Software architecture of data fusion system

There are three functions in our model of data fusion procedures (DFP): the event transformer, event detector, and event dispatcher. See Figure 6. We plan to use languages such as SQL and XML to program the components of DFP. These languages provide relational constructs necessary for object-level data aggregation. For event notification purposes, we shall develop Java-based program interfaces to the DFP procedures.

The event transformer maps the physical manifestation of an event into a form that can be observed by the DFP. The user may supply the required mapping functions through the GUI. The event dispatcher maintains a registry of predicates supplied by the application (the registry may be viewed as an event rule library). It invokes the DFP with a predicate definition and appropriate parameters. When a predicate becomes true, signifying the occurrence of an event, a call back to the application may occur (say, to generate an audible siren and/or a visual red alert from a GUI).
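The registry-and-callback behavior of the event dispatcher can be sketched as below. The `EventDispatcher` class and its method names are hypothetical, and the example is written in Python for brevity even though the proposal plans Java-based interfaces; the `senrel` predicate follows the SENREL specification given earlier in the proposal.

```python
from typing import Callable, Dict, List

Data = Dict[str, float]
Predicate = Callable[[Data], bool]
Callback = Callable[[str, Data], None]


class EventDispatcher:
    """Keeps a registry of named predicates and calls back when one holds."""

    def __init__(self) -> None:
        self._rules: Dict[str, Predicate] = {}
        self._callbacks: Dict[str, Callback] = {}

    def register(self, name: str, predicate: Predicate, callback: Callback) -> None:
        """Add an application-supplied predicate to the rule library."""
        self._rules[name] = predicate
        self._callbacks[name] = callback

    def dispatch(self, data: Data) -> List[str]:
        """Evaluate every registered predicate; return the names that fired."""
        fired: List[str] = []
        for name, predicate in self._rules.items():
            if predicate(data):
                self._callbacks[name](name, data)  # call back to the application
                fired.append(name)
        return fired


def senrel(data: Data) -> bool:
    """SENREL: more than 10 sensors in the bank, over 70% of them non-faulty."""
    return data["Ns"] > 10 and data["Ng"] / data["Ns"] > 0.7
```

An application would register `senrel` together with a callback that raises the alert, then feed each fused data snapshot to `dispatch`.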


Figure 6: Software components of semantics-driven data fusion system

5.5: Developing menu-driven GUI

We plan to develop an interactive menu-driven window interface between human users and the machines implementing the DFP. The menus will basically identify the set of system functions as a set of buttons displaying the verbose description of the physical events and the data describing them. When a user clicks on a button, a value menu space will pop open, indicating the possible values for the function type clicked. A set of applicative procedures prescribed by the DFP will also be loaded into the desktop as icons. Using these buttons and icons, a user can profile an event as satisfying any desired condition. Users can profile different system properties by clicking on these buttons and invoking pre-loaded applicative procedures in the desktop to operate on the variables. Event causality relationships among system interface events may likewise be prescribed from the menu system.

The user-friendliness of the menu-driven window interface to the DFP may allow users with less experience in distributed programming and algorithm development to inject a variety of critical scenarios when conducting the event simulations. In one application, commanders in a battlefield may study the morale of soldiers using low level observations of the soldiers' behaviors. In another application, paramedical personnel may gear themselves up to attend to a potential disease outbreak in a geographic area. These case studies will be conducted in our research by event simulations.


6.0 Expertise of project personnel

6.1 Qualifications of PI K. Ravindran: Professor of Computer Science (City College, CUNY); Ph.D. (Computer Science, 1987, University of British Columbia); research areas: distributed collaboration systems & protocols, information assurance systems, service-level management of network infrastructures; managed external grants/contracts of about $1.1M; has about 90 refereed publications; 17 years of university experience and 5 years of experience in space and communication industries.

The PI has studied on-line monitoring and control paradigms for distributed multimedia networks and enterprise web server systems. These works involve extensive simulation modeling of the underlying network and server infrastructures. This expertise will help in the conduct of the AFRL-HE project. The PI has also worked on coarse-granular event management services in distributed collaboration settings. The PI has worked extensively in the area of replica voting algorithms to decide among conflicting results involving deterministic input data, partly through summer fellowships at the Air Force Research Lab (Rome, NY) during 2001-07. This expertise will be useful in the application case studies.

6.2 Role of Graduate Students

A 14-month graduate student position has been requested. The graduate student will assist in developing the software and simulation tools needed to implement the data fusion procedures. The graduate student will have expertise in MATLAB, XML, UML, and Java software tools, which will be useful for developing the DFP software. In addition, the graduate student will look into the geography and social aspects of human behavior. This will be useful in conducting case studies for assessing the human-effectiveness oriented metrics using our model of data aggregation (such as the morale of soldiers when deployed in remote regions where the social demographics and the geography of the region have a strong impact). Personnel with the above combination of expertise will be sought to participate in this project.

7.0 Schedule of milestones

1. Implementation of discrete-event simulation testbed for event aggregation using MATLAB (Aug. 08 to Dec. 08);

2. Preparation of in-house network testbed at CUNY for experimental studies (Oct. 08 to Jan. 09);

3. Development of GUI for event specification and aggregation (Jan. to April 09);

4. Development of SQL and XML schema for event specifications (Jan. to April 09);

5. Study of replica voting mechanisms for event fusion (May to July 09);

6. Case studies of battlefield applications and human-in-the-loop simulations, in collaboration with AFRL-HE lab (June to Aug. 09);

7. Documentation of project results (June to Sept. 09).


A: STATEMENT OF WORK

TASKS/TECHNICAL REQUIREMENTS

The contractor shall accomplish the following:

A.1 Study models for characterizing the external environment behaviors and how they map into specific data characteristics;

A.2 Develop the distributed control mechanisms from the high-level specification of data fusion procedures;

A.3 Develop algorithms for event filtering at the protocol level and for event propagations at the application interface level;

A.4 Develop designer-friendly GUIs for rapid event compositions and visual evaluation in the in-house network testbed at CUNY;

A.5 Develop SQL-based and XML-based schemas for specifying event filters and aggregation rules;

A.6 Carry out application case studies of data aggregation procedures for use in military settings: i) the analysis of DOS attacks on distributed networks using visual tools, and ii) the study of dashboard-like visual interfaces for QoS control and the underlying system structures.

The task deliverables A.1-A.6 will be accomplished on a software-level simulation testbed with real data obtained from AFRL-HE offices.

During the project period, the contractor shall write technical reports outlining the progress of research work for dissemination of the results (both positive and negative) to the technical liaison groups in the AFRL-HE. To permit full understanding of the techniques and procedures used in evolving the protocol testing technology, the reports will include pertinent observations, nature of technical problems, design methods used, computer algorithms developed, etc. The contractor shall also make detailed technical presentations on the progress of work at the above site semi-annually during the project period, and present the completed work (including the demonstration of data fusion software tools and applications) at the end of the project period. In addition, the contractor shall provide a demonstration of the in-progress project works on the CUNY simulation testbeds in about 16 months from the start of the project.

The contractor shall also write research articles and technical papers for publication in journals and conferences for wider dissemination of the results across academic and industrial research communities in the areas of advanced network architectures. Such publications will carry a citation to acknowledge the AFRL-HE contract in supporting the published work.

References

[1] S. Chamberlain. Automated Information Distribution in Bandwidth-constrained Environments. In Proc. Milcom94, North-Holland pub., 1994.


[2] H. Kopetz and P. Verissimo. Real Time Dependability Concepts. Chap. 16, Distributed Systems, ed. S. Mullender, Addison-Wesley Publ. Co., 1993.

[3] V. K. Garg. Observation of Global Properties in Distributed Systems. In Proc. Intl. Conf. on Software and Knowledge Engineering, IEEE CS, pp.418-425, Lake Tahoe (NV), June 1996.

[4] Y. Yao and J. Gehrke. The Cougar Approach to In-network Query Processing in Sensor Networks. In ACM SIGMOD Record, 2002.

[5] S. Madden, M. Franklin, J. Hellerstein, and W. Hong. TinyDB: an Acquisitional Query Processing System for Sensor Networks. In ACM Transactions on Database Systems, 2005.

[6] C. Intanagonwiwat, R. Govindan, D. Estrin, J. Heidemann, and F. Silva. Directed Diffusion for Wireless Sensor Networking. In IEEE/ACM Transactions on Networking, Feb. 2003.

[7] Sensor Model Language (SensorML). In http://vast.uah.edu//SensorML, 2005.

[8] K. Ravindran and Jun Wu. Programming Models for Behavioral Monitoring of Distributed Information Networks. In proc. Distributed Systems and Real-time Applications, IEEE-DSRT 2005, Oct. 2005.

[9] K. Birman et al. Astrolabe: a publish-subscribe based event processing system. In Technical reports, Cornell University, 2002-2005.

[10] C. G. Cassandras and S. Lafortune. Introduction to Discrete Event Systems. In Kluwer Academic Publishers (Springer), 2007.

[11] R. Stadler, F. Wuhib, M. Dam, and A. Clemm. Decentralized Computation of Threshold-crossing Alerts. In proc. conf. on Distributed Systems: Operations and Management, IEEE/IFIP, Barcelona (Spain), Oct. 2005.

[12] D. G. Stork, R. O. Duda, and P. E. Hart. Pattern Recognition Systems. Chapter 1.3, Pattern Classification, 2000.

[13] W. Hu, A. Misra, and R. Shorey. CAPS: Energy-Efficient Processing of Continuous Aggregate Queries in Sensor Networks. In proc. 4th Intl. conf. on Pervasive Computing and Communications, IEEE-PerCom06, pp.190-199, June 2006.

[14] S. Kabadayi, A. Pridgen, and C. Julien. Virtual Sensors: Abstracting Data from Physical Sensors. In Technical Report 2006-01, University of Texas at Austin, 2006.

[15] K. Stranc. Airborne Networking. In Presentation, MITRE Corporation, Public Release Ref. #04-0941, 2004.

[16] R. Clark, E.D. Jensen, A. Kanevsky, J. Maurer, T. Wheeler, Y. Zhang, D. Wells, T. Lawrence, and P. Hurley. An Adaptive, Distributed Airborne Tracking System. In proc. IEEE WPDRTS, vol.1586 of LNCS, 1999.

[17] W. J. Gutjahr. Reliability Optimization of Redundant Software with Correlated Failures. In proc. 9th Intl. Symp. on Software Reliability Engineering, Germany, 1998.


[18] J. Jachner, S. Petrack, E. Darmois, and T. Ozugur. Rich Presence: A New User Communications Experience. In Alcatel Telecommunications Review, Technology White Paper, pp.73-77, 1st quarter 2005.


K. Ravindran

Current Position: Professor of Computer Science, City University of New York (City College), New York (joined in 1996)

Degrees Received:
Ph.D. (Computer Science), University of British Columbia, Canada, 1987
M.Eng. (Computer Science & Automation), Indian Institute of Science, Bangalore, 1
B.Eng. (Electronics), Indian Institute of Science, 1976

Work experience:
1. Assistant Professor, Department of Computing & Information Sciences, Kansas State University (1989
2. Member of Scientific Staff, Bell Northern Research, Ottawa, Canada (1988-1989)
3. Assistant Professor, Department of Computer Science, Indian Institute of Science (1988)
4. Teaching/Research Assistant, Department of Computer Science, University of British Columbia (1983-1987)

Research activities:
1. Working on distributed systems modeling and design (for information security, replication control, QoS assurance)
2. Working on architectures and protocols for Future Internet, network management, traffic engineering
3. Published about 90 papers in refereed international conferences and journals
4. Received research grants/contracts from IBM, Philips Research, US Air Force, ARPA, BMDO during 1992-01 (totaling about $1.1M)
5. Supervised research theses of 4 Ph.D. and 18 M.S. students (currently supervising 2 Ph.D. thesis works and 2 M.S. thesis works)
6. Selected as Senior Summer Faculty Fellow at Naval Research Lab, 2008
7. Summer Faculty Fellow at Air Force Research Lab, Rome, 2001, 2003, 2005-07

Collaborators:
1. Dr. Kevin A. Kwiat, Air Force Research Laboratory (Rome, NY) (2002-present). Areas: Information Assurance and Security
2. Dr. K. K. Ramakrishnan, AT&T Research (Florham Park, NJ). Areas: Rate Control for Video Distribution in Large Multicast Groups
3. Prof. D. Kumar, City College of CUNY (New York, NY). Areas: Design and Validation of Distributed Protocols


Teaching activities:
1. Instructor for graduate Distributed Systems course
2. Instructor for graduate level and undergraduate level operating systems and computer networks courses
3. Instructor for undergraduate level Data Structures & Algorithms


Publications relevant to proposal

References

[1] X. Liu, K. Ravindran, and D. Loguinov. A Queuing-theoretic Foundation of Available Bandwidth Estimation: Single Hop Analysis. Accepted for publication in IEEE/ACM Transactions on Networking, June 2006.

[2] K. Ravindran and Jun Wu. Architecture for Dynamic Protocol-level Adaptation for Enhancing Network Service Performance. In proc. IEEE/IFIP Conf. on Network Operations and Management (NOMS06), Vancouver (Canada), April 2006.

[3] X. Liu, K. Ravindran, and D. Loguinov. Towards a Generalized Stochastic Model of End-to-End Packet Pair. Accepted for publication in IEEE Journal on Selected Areas in Communications, March 2006.

[4] A. Sabbir and K. Ravindran. Concurrency Control for Interactive Sharing of Data Spaces for Distributed Real-time Collaborations. In proc. of IEEE Intl. Symp. on Distributed Simulation and Real-time Applications, Montreal (Canada), Oct. 2005.

[5] K. Ravindran and X. Liu. Service-level Management Frameworks for Adaptive Distributed Network Applications. In Springer Verlag Lecture Notes on Computer Science, (Service Availability), LNCS 3335, 2005.

[6] K. Ravindran, J. P. Fortin, and X. Liu. Flow Management for QoS-controlled Data Connectivity Provisioning. Accepted for publication in Elsevier Journal of Computer Communications, Feb. 2005.

[7] K. Ravindran. Programming Models for Behavioral Monitoring of Distributed Information Networks. In proc. of Hawaii Intl. Conf. on System Sciences, Big Island (HI), Jan. 2005.

[8] K. Ravindran and A. Sabbir. Event-based Programming Structures for Multimedia Information Flows. In Springer Verlag Lecture Notes on Computer Science, (Management of Multimedia Networks and Systems), LNCS 3271, 2004.

[9] K. Ravindran and R. Steinmetz. Object-oriented Communication Structures for Multimedia Data Transport. In IEEE Journal on Selected Areas in Communications, Special Issue on Distributed Multimedia Systems and Technology, vol.14, no.7, pp.1360-1375, Sept. 1996.

[10] K. Ravindran and X. T. Lin. Structural Complexity and Execution Efficiency of Distributed Application Protocols. In Proc. Conf. on Communication Architectures, Protocols and Applications, ACM SIGCOMM, San Francisco (CA), pp.160-169, Sept. 1993.


Other significant Publications

References

[1] K. A. Kwiat, K. Ravindran, and P. Hurley. Energy-efficient Replica Voting Mechanisms for Secure Real-time Embedded Systems. In proc. of Intl. Conf. on World of Wireless and Mobile Multimedia, WOWMOM05, Taormina (Italy), June 2005.

[2] K. Ravindran, K. A. Kwiat, and A. Sabbir. Adapting Distributed Voting Algorithms for Secure Real-time Embedded Systems. In proc. of workshop on Distributed Auto-adaptive Reconfigurations in Software Systems, ICDCS-DARES04, Tokyo (Japan), March 2004.

[3] K. Ravindran, A. Sabbir, D. Loguinov, and G. S. Bloom. Cost-optimal Multicast Trees for Multi-source Data Flows. In proc. INFOCOM01, IEEE Com. Soc., Anchorage (AK), April 2001.

[4] K. Ravindran and T. J. Gong. Cost Analysis of Multicast Data Transport Architectures in Multi-service Networks. In IEEE/ACM Transactions on Networking, Feb. 1998.

[5] K. Ravindran, G. Singh, and C. M. Woodside. Architectural Concepts for Implementation of End-systems High Performance Communications. In Intl. Conf. on Network Protocols (ICNP), Columbus (OH), Oct. 1996.

[6] K. Ravindran. Architectures and Protocols for Data Multicasting in Multi-service Networks. In Computer Communications Review, ACM SIGCOMM, July 1996.

[7] K. Ravindran. A Flexible Network Architecture for Data Multicasting in High Speed Multi-service Networks. In IEEE Journal of Selected Areas in Communications, Special Issue on Global Internets, Oct. 1995.

[8] K. Ravindran and V. Bansal. Delay Compensation Protocols for Synchronization of Multimedia Data Streams. In IEEE Transactions on Knowledge and Data Engineering, Special Issue on Multimedia Information Systems, vol. 5, no. 4, pp.574-589, Aug. 1993.
