Challenges in managing uncertainty during cyber events:
Lessons from the staged-world study of a
large-scale adversarial cyber security exercise
Matthieu Branlat, Alexander Morison, David Woods
Cognitive Systems Engineering Laboratory, The Ohio State University
ABSTRACT
In spite of the recognized challenge and
importance of developing knowledge of the
domain of cyber security, human-centered
research to uncover and address the difficulties
experienced by network defenders is to a large
extent lacking. Such research is needed to
influence the design of the joint systems facing
cyber attacks.
Network defense depends on the capacity to
successfully identify, investigate and respond
to suspicious events on the network. Such
events correspond to fragmented micro
phenomena that occur in a background of
overwhelming amounts of very similar
network activity. One key question in order to
understand and better support cyber defense is:
how can cyber defense systems manage the
high level of uncertainty they face?
Human operators are essential in network
analysis. Network analysts operate within
teams, and within larger organizations. These
are essential dimensions of cyber security that
remain under-researched. They are also at the
heart of challenges typical of joint activity in
complex work systems: tasks conducted by
distinct teams put them at risk of working at
cross-purposes, and security goals conflict with
the organization's production goals. In
addition, cyber events are fundamentally
adversarial events, a dimension of cyber
security also under-researched.
This paper presents findings from a staged-
world study of a large-scale adversarial cyber
security exercise. It describes the challenges
and management of uncertainty in this context
and discusses their implications for the design
and development of better defense systems.
INTRODUCTION
The continuously growing connectivity of
systems creates increasingly complex digital
infrastructures that enable critical and valued
services. This source of performance also
constitutes a source of vulnerability to cyber
threats, a growing concern expressed in
military, financial and industrial domains. In
particular, the potential impact of cyber attacks
on critical infrastructures and services that
societies depend on daily is worrying. Industrial
control systems, seldom designed with cyber
security in mind, also exist in a competitive economic
context in which proprietary information
becomes decisive. These characteristics make
industries high-value targets for cyber
terrorism (Finco, Lee, Miller, Tebbe and
Wells, 2007). Importantly, cyber security
experts observe that, at the same time, the
knowledge cost for hackers is getting
considerably lower (Goodall, Lutters and
Komlodi, 2004), especially because of the
large availability of information,
documentation and even ready-to-use software.
On the other hand, cyber defense remains a
highly demanding task. Numerous efforts exist
to improve cyber defense, typically focused on
the search for technological solutions. But in
spite of the recognized challenge and
importance of developing knowledge of this
critical domain, human-centered research to
uncover and address the difficulties
experienced by network defenders is recent
and still limited. Moreover, understanding
cyber security, a fundamentally adversarial
domain, requires investigations of the
interrelated defense and attack processes, but
such studies are rare. While research has
produced models of cyber attack or defense
processes, simultaneous investigations of both
processes do not appear to exist (studies
usually rely more or less explicitly on
hypothesized attacker or defender behavior).
Such research is needed to influence the design
of the joint systems facing cyber attacks.
Common publications about cyber defense are
how-to resources that focus on technological
dimensions of the domain and associated
knowledge and skills (e.g., firewalls and their
management). In this type of literature,
network analysts are expected to follow good
practices in order to ensure network security.
However, other authors recognize that, in spite
of significant technological progress, human
analysts continue to be key elements of
network security. Based in part on cognitive
task analysis methods, detailed accounts of
network defense analysts’ work do exist, but
are largely focused on this single perspective
within the larger context of cybersecurity
(Goodall et al., 2004; D'Amico and Whitley,
2008). More recently, publications from a
group of researchers at the University of
British Columbia have described the
collaborative nature of cyber defense and its
processes within the larger organizational
framework (Werlinger, Muldner, Hawkey and
Beznosov, 2010; Hawkey, Muldner and
Beznosov, 2008).
Cyber attacks have been described based on
after-the-fact investigations or expert
interviews. These accounts are informed
interpretations at best, since available data
often are scarce and highly ambiguous. Most
studies have focused on defense relying more
or less explicitly on hypothesized attacker
behavior. A notable exception is Jonsson and
Olovsson’s study (1997) of cyber attack
dynamics (but this study made assumptions
that limited its realism).
The focus of this paper will be primarily on
cyber defense. However, understanding cyber
defense requires considering the dynamics of
cyber attack and of the interplay between
attack and defense. These dynamics will,
therefore, be presented here; they are described
in greater detail elsewhere (Branlat et al.,
2011; Branlat, 2011). Insights from these
processes of cyber security result in directions
for the improvement of cyber defense.
STUDY CONDUCTED
The research described in this document stems
from an on-going collaboration between the
Cognitive Systems Engineering Laboratory
(CSEL) at the Ohio State University and the
Idaho National Laboratory (INL). It consists of
a large-scale staged-world study of an
adversarial cyber security exercise. The
exercise was part of a weeklong training
organized by INL for the energy sector and
aimed at raising awareness about cyber
security. Forty people coming from this
industrial domain participated in the exercise,
among which a majority were IT specialists
(with various competences such as database or
network management). Some participants had
pre-existing knowledge in cyber security, but
no participant was an expert in this domain.
The environment of the exercise consisted of
the simulation of a typical industrial facility in
charge of producing some product and relying
on a large network in order to control the
production of the physical process. During the
12-hour competitive Red vs. Blue exercise, the
Red team attempted to take advantage of the
openness of the network in order to attack it
from the outside and, ultimately, perturb the
process. On the other hand, the Blue team
operated within the organization and was in
charge of protecting the network and,
ultimately, of maintaining the production.
Figure 1 – General exercise environment
The study conducted is a staged-world study
(Woods and Hollnagel, 2006, Chap. 5) based
on a found scenario designed by domain
experts from INL. It actually does not rely on a
typical scripted scenario to provide interesting
learning situations for participants. It rather
consists of a high-validity exercise
environment (configuration of network and
assets, production and organizational
environment) in which the activity of the Red
team serves as the main pacer and source of
perturbations for the Blue team.
Data capture involved 4 observers with
relevant knowledge background (2 Cognitive
System Engineers with computer science
background and 2 Human Factors specialists
with energy sector expertise). Observers were
distributed in the various physical spaces to
capture teams' activities through hand notes.
They were supplemented by various fixed and
targeted audio/video recording devices.
Analyses implemented a process-tracing
methodology (Woods, 1993) based on the
transcription of physical behavior and verbal
communications recorded on each side.
Preparatory work based on domain literature
initially informed the study and its
methodology by identifying domain
characteristics. These anticipated
characteristics also allowed for the
identification of relevant literature, such as that
related to distributed anomaly response or
adversarial interplay. The relevant literature
provided a theoretical framework for the
various phases of the study.
CYBER ATTACK
The general goal of the Red team was to
defame, perform reconnaissance, invade, and
eventually break or destroy the physical
process defended. The essential goal of attack
is to connect their network to the Blue team's
network with sufficient permission (e.g., root
access) with the end result being the ability to
modify the internal network, processes, and
external connections.
The recurring pattern of attack observed during
the exercise is the following:
- through some intelligence phase, the team
discovers a vulnerability on the network,
- members attempt to exploit it,
- one member gets access,
- access can be used to conduct further
intelligence, get further access, and
compromise the host to do damage or secure
presence (e.g., modify the system's settings),
- eventually, access is lost, often
unintentionally (e.g., a bad operation kills
the connection) or as a result of the Blue team's
actions (e.g., the host is rebooted).
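The recurring pattern above can be sketched as a simple state machine. This is an illustrative reconstruction only: the state names and transition conditions are assumptions introduced here, not elements of the exercise's actual tooling or of the authors' analysis.

```python
# Hypothetical sketch of the recurring attack cycle described above.
# States and transition flags are illustrative assumptions.
from enum import Enum, auto

class AttackState(Enum):
    INTELLIGENCE = auto()   # probing the network for vulnerabilities
    EXPLOIT = auto()        # attempting to exploit a discovered vulnerability
    ACCESS = auto()         # foothold gained on a host
    LOST = auto()           # access lost (bad operation or Blue team action)

def step(state, vulnerability_found, exploit_succeeded, access_lost):
    """One transition of the attack cycle."""
    if state is AttackState.INTELLIGENCE:
        # intelligence continues until a vulnerability is discovered
        return AttackState.EXPLOIT if vulnerability_found else state
    if state is AttackState.EXPLOIT:
        # failed exploits send the team back to intelligence gathering
        return AttackState.ACCESS if exploit_succeeded else AttackState.INTELLIGENCE
    if state is AttackState.ACCESS:
        # from a foothold: further intelligence, further access, or compromise,
        # until access is eventually lost
        return AttackState.LOST if access_lost else state
    # after losing access, the cycle restarts
    return AttackState.INTELLIGENCE
```

The loop structure makes the path dependency discussed next visible: each pass through the cycle starts from whatever access the previous pass left behind.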
Cyber attack is characterized by path
dependency: potential actions and access
depend on what a location affords due to the
configuration of the networks and their assets
(e.g., access to machine A will afford different
actions and further access than access to
machine B). The pattern of attack presented
above risks giving a false sense of incremental
advances towards a well-defined goal and
through a pre-identified general path. The
general process is more akin to a laborious
exploration in the dark. This corresponds to a
type of behavior that is opportunistic and based
on trial-and-error, not entirely planful. This
pattern of attack behavior relates to the
difficulty of maintaining strategic planning
while focusing on tactical or technical
challenges. Also, the adversarial environment
is the source of trade-offs between efficiency
and exposure to detection while choosing
courses of actions. Core goals of the attack,
therefore, include avoiding detection.
Disturbances to their progress (e.g., machine
they had compromised is rebooted) are
perceived as indications of detection and
actions by the defense. They create an urge to
“do something” before all access is
completely lost. This type of pressure creates
further threats to their strategic goals as
compromising actions are typically more
detectable and risk revealing their presence.
Progressing in the network through various
actions, therefore, implies the risk of being
detected and denied opportunities to conduct
the attacking plan. On the other hand, because
of the nature of networks and network assets,
and because of the need for networks to be
partially opened in order to provide valuable
services, there are always opportunities for
attackers to get in somewhere.
CYBER DEFENSE
Defending a valuable digital infrastructure
requires pursuing two interrelated goals:
maintaining production while preventing
hackers from gaining access and from acting
on the network (e.g., stealing or corrupting
data, interfering with process production).
First, despite the potential disruptions, the
team needs to maintain the process that
supports the organization's production and
service activities. Second, the defense team
needs to provide security on their local
network. This involves preventing illegitimate
activity by removing potential vulnerabilities
and stopping illegitimate access when
discovered.
While monitoring network activity, the central
question for the defense team is: Is the activity
observed legitimate or illegitimate? Faced with
high amounts of traces of activity on the
network, the defense team needs to distinguish
valid traffic from traces that are indicative of
attack processes. Answering this deceptively
simple question requires the ability to detect
suspicious traces of activity, as well as to
implement investigation processes that aim at
validating their nature. Characteristics of
network traffic (type, source, target, or other
aspects such as volume of exchanges) define
what might be considered suspicious activity.
More elaborate investigations consider the
relationships between the characteristics of
activity observed, i.e., about the type of
activity in relation to its source and/or target.
These investigations aim at addressing whether
the activity is to be expected in the context of
the network and its assets. Unfortunately,
characteristics of the cyber security domain
make processes of sensemaking (Klein, Moon
and Hoffman, 2006) particularly challenging.
These characteristics are various forms of
uncertainty associated with network activity:
- Defenders cannot observe attacking actions
directly, but only through traces available
on the network.
- Elementary traces of activity are often
ambiguous, i.e., the same traces can mean
different things.
- Furthermore, these traces are incomplete
accounts of Red's behavior, especially
since Blue mostly has access to data
presented by technological systems that
filter and interpret basic network activity.
- Meaningful units of attack activity are also
scattered: they are based on multiple micro-
events that require different perspectives to
be analyzed as a whole (a process labeled
correlation in the literature).
- Finally, these challenges exist in an
incompletely known and evolving
environment, where knowledge is key to
understanding traces observed.
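The correlation process mentioned above can be illustrated with a minimal sketch: grouping scattered micro-events into candidate incidents. The field names and the grouping rule (events from the same source within a time window) are assumptions made here for illustration; real correlation engines use far richer criteria.

```python
# Illustrative sketch of correlating scattered micro-events into
# candidate incidents; fields and grouping rule are assumptions.
from collections import defaultdict

def correlate(events, window=300):
    """Group (timestamp_s, source_ip, event_type) tuples into candidate
    incidents: same source, each event within `window` seconds of the
    previous event from that source."""
    by_source = defaultdict(list)
    for ts, src, kind in sorted(events):
        incidents = by_source[src]
        # extend the current incident if this event is close enough in time
        if incidents and ts - incidents[-1][-1][0] <= window:
            incidents[-1].append((ts, kind))
        else:
            incidents.append([(ts, kind)])
    return dict(by_source)

# Hypothetical traces: two bursts from one source, one from another.
events = [
    (0, "10.0.0.5", "port_scan"),
    (100, "10.0.0.5", "login_fail"),
    (1000, "10.0.0.5", "login_fail"),   # > 300 s gap: a new incident
    (50, "10.0.0.9", "dns_query"),
]
incidents = correlate(events)
```

Even this toy version shows why correlation is hard: the meaning of an incident only emerges from the grouping, and any single trace in isolation remains ambiguous.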
Once events are identified with sufficient
understanding and certainty, response actions
aim at correcting vulnerabilities revealed by the
attacks and, ultimately, at impairing their
progression. However, this phase of anomaly
response (Woods and Hollnagel, 2006, Chap.
8) is where the defense team is confronted with
the difficulty of providing security within a
production environment. Providing network
security while maintaining production is a
fundamental trade-off that defenders have to
manage. Essentially, this trade-off is about
network access. From the perspective of
providing security, characteristics of network
configuration such as open ports, weak
external firewall rules, susceptible services,
and unprotected network paths between
internal sub-networks, represent vulnerabilities
that need to be eliminated. However, from the
perspective of maintaining production, these
same network properties facilitate network-
based activities. In the context of this trade-off,
network defense is faced with difficulties in
providing evidence of attack. The production of
evidence suffers from the same early detection
(“early warning”) problems as other domains
(Woods, 2009): evidence is often ambiguous
and uncertain when sought early and
proactively; it is usually clearer after the fact,
once an adverse event has occurred and can be
traced back to earlier events. What constitutes
evidence becomes a subject of debate and
negotiation because of the impact of
corresponding measures on production goals.
Defense processes require ample knowledge
about the network configuration and assets.
They are also highly collaborative and, therefore,
require knowledge and understanding of
actions conducted by other team members. For
instance, patching one machine (to reduce
vulnerability) generates temporary unusual
(i.e., suspicious) traffic while services are
restarted.
INTERPLAY
No single perspective on the events can
capture both sides simultaneously. A cyber
event is nonetheless fundamentally adversarial
and therefore needs to be analyzed through the
interplay of both sides' decisions and actions.
The interplay plays out in the network itself,
which can be seen generically as a highly
organized medium that connects clients and
providers of services. Network activity
becomes the central frame of reference to
study actors' decisions and actions.
The following figures adopt a network-centric
view to contrast the attack and defense
perspectives. This contrast aims at highlighting
important similarities and differences.
Figure 2 – Attack perspective
The figures are simplified representations of
the processes the Red and Blue teams engaged
in throughout the exercise observed. For each
perspective, a number of their actions represent
sources of network activity (represented by the
arrows pointing to the types of network
activity). Also, each side operates with the
knowledge that some other actors exist on the
network, which are an important part of their
own activity but of which they have limited
knowledge or observability (these other actors
are represented by the gray elements in the
figures).
Figure 3 – Defense perspective
Cyber attackers and defenders share the same
technological environment – the network and
its applications – and, to a large extent, require
very similar competences and frames of mind.
In addition, some of the primary tasks that both
sides conduct are the same, for instance:
identifying and understanding network
vulnerabilities and vulnerable machines;
developing and maintaining sufficient
knowledge of the network; understanding
network activity. Through similar tools and
actions, both sides generate similar-looking
activity traces on the network.
Interestingly, both sides are conscious of the
other side's presence, and act accordingly. The
way in which they conduct their activity
integrates the threat that the other side's
actions represent to their own mission: the
attacking team knows the network is
monitored, and the defending team knows the
types of actions attackers are trying to
implement. However, neither side is in a
position to actually observe the other directly.
Each side therefore relies on what can be
observed or experienced on the network. Such
information allows them to infer the other's
behavior and take corresponding measures
(e.g., adapt their plan). Data from the exercise
show that, on both sides, inferences can be
correct or not. Inferences are in fact often
incorrect. The tendency on both sides seems to
be to interpret unexplained adverse events as
results of the adversary's actions.
Analysts on the defensive side monitor
network activity to detect anomalous traffic.
They are essentially concerned with being able
to distinguish between legitimate and
illegitimate activity. However, due to scale and
low observability, they have necessarily
limited and/or fragmented knowledge of the
potential sources of activity. To develop
knowledge of their network, they use both
active ways to probe the network and passive
ways to monitor the traffic. Active probing
constitutes another source of activity, one that,
in addition, appears quite similar to hacking –
a source of interaction and goal conflicts. As
attackers probe the network with similar
traffic, using similar tools in the same space, it
becomes difficult for network analysts to
efficiently sort the traces generated, and their
own activity risks providing a mask to the
traffic they want to detect. The main tactic
used during the exercise observed to recognize
suspicious exploratory traffic is the
identification of the source IP addresses. IP-
based recognition nonetheless represents a
brittle mechanism: it assumes an extensive
low-level knowledge of the network as well as
relative network stability; and it relies heavily
on human memory and string recognition for a
type of information that is probably not the
most conducive to these cognitive processes.
Defenders experience a challenging situation
of data overload where few mechanisms
efficiently support the necessary organization
of network traffic, context sensitivity, and
control of attention (i.e., focusing and
reorienting).
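The brittleness of the IP-based recognition tactic described above can be made concrete with a toy sketch. The host inventory and network range here are hypothetical; the point is that the mechanism works only as well as the inventory it rests on.

```python
# Minimal sketch of IP-based recognition of suspicious sources.
# The inventory and internal range are hypothetical assumptions;
# any host missing from the inventory, or any renumbering, produces
# false alarms or misses -- the brittleness noted in the text.
import ipaddress

KNOWN_INTERNAL_HOSTS = {   # assumed inventory of legitimate sources
    "192.168.1.10",        # e.g., a historian server
    "192.168.1.20",        # e.g., an operator workstation
}

def classify_source(src_ip, internal_net="192.168.1.0/24"):
    """Label a source IP roughly the way an analyst relying on
    memorized addresses would."""
    addr = ipaddress.ip_address(src_ip)
    if str(addr) in KNOWN_INTERNAL_HOSTS:
        return "known-internal"
    if addr in ipaddress.ip_network(internal_net):
        # inside the network but not inventoried: suspicious
        return "unknown-internal"
    # external sources were suspicious by default in this exercise
    return "external"
```

A mechanized version like this removes the human-memory burden but inherits the same assumptions: extensive low-level knowledge of the network and relative network stability.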
Attack and defense participate in a
fundamentally asymmetric relationship. From
the perspective of the defense, the main task is
to make sense of the attacking team's
behavior. From the perspective of the attacking
team, this is an important but secondary
objective. Attackers are indeed more focused
on understanding the unknown environment so
that they can make progress. In a sense, the
other side's perception underlies their activity
as well, but more as a potential for hindering
their progress. In addition, the two teams are not
equally positioned when faced with inadequate knowledge
or actions. The defending team cannot afford
to have an approximate understanding of the
situation or an inadequate response
implementation. On the other hand, the
attacking team will have numerous
opportunities to do damage. The
implementation of a pre-defined strategy is
what can be difficult for them, but their
process is more opportunistic by nature.
One of the characteristics that emerge from the
observations is the frequent delay between the
initial events and their detection. If the delays
correspond to actual latencies in the detection
process, this suggests that an attacking team
commonly has a window of opportunity during
which they can implement various actions
before being detected and disconnected. On the
other hand, attackers' progression is hindered
by the fact that they are 'fumbling in the dark',
i.e., access gained to a particular machine does
not translate immediately into further access to
more sensitive data or assets. The intelligence
or compromising actions attackers are required
to perform to build further knowledge or
secure access risk uncovering them. This
means that defenders also have a window of
opportunity to act, during which more signs of
attackers' presence become available. The
situation therefore resembles a chasing game
between the teams, attackers being most often
in a position to set the pace.
TOWARDS RESILIENT CYBER DEFENSE
A control problem
Since disturbances will occur that challenge
the way systems normally operate, it becomes
necessary to think of organizations seeking
network security as adaptive systems. Because
of the scale and high degree of functional
interdependencies (in the network
configuration or in the distribution of tasks),
such organizations are also complex systems.
As complex adaptive systems, they need to
manage trade-offs in the face of uncertainty,
complexity and production pressures.
Difficulties in the management of these trade-
offs risk exposing the organizations to the
three basic patterns identified in domains
sharing similar core characteristics (Woods
and Branlat, 2011):
- They need to adapt so as to keep up with
the pace of events.
- They need to adapt while managing
interdependencies and avoiding working at
cross-purposes.
- They need to modify their response
strategies when these prove ineffective.
These basic patterns define high-level goals
that represent what it means to “be in control”
(Woods and Branlat, 2010) in the domain of
cyber defense. Based on the description of the
activities on the attacking and defensive sides
and of their interplay, and in order to avoid the
three basic patterns of adaptive failure, being
in control means:
- anticipating how adverse events may
evolve in order to take advantage of the
window of opportunity on the defensive
side,
- understanding and managing the impact of
adverse events on the system (network,
production), and
- understanding and managing the impact of
the response on production goals.
From the understanding of what it means to be
in control, it is possible to discuss ways to
amplify control for cyber defense. Resilience
Engineering emphasizes how resilient control
is related to adaptive capacity and its
management (Woods and Branlat, 2010,
2011). Amplifying control essentially means
transforming systems so as to help them avoid
“failure[s] to adapt or adaptations that fail”
(Dekker, 2003), i.e., situations where
adaptations are not successful, either because
systems fail to recognize the need for
adaptation, or because the adaptive processes
themselves produce undesired consequences.
This section will explore potential directions of
investigation and development to support
cyber defense in avoiding maladaptive patterns
based on principles underlying resilience
(Hollnagel, Woods and Leveson, 2006; Woods
and Branlat, 2011). These directions are
ultimately related to the general problem of
managing uncertainty.
Sensemaking, anticipation, adversarial
interplay: adapting in time
In the observations conducted, the detection of
elementary and potentially suspicious traces of
activity does not seem to be the main problem,
apart from the latency mentioned above. The
bigger issues are determining what actually
happens, i.e., what it means in terms of
purposeful actions perpetrated by the attacking
team. Detecting anomalies from its dispersed
symptoms does not equate correctly adding up
elements gathered separately (Klein, Pliske,
Crandall and Woods, 2005). Each trace of
activity in itself, taken in isolation, is
ambiguous or insufficient in order to infer the
general problem. Isolated traces nonetheless
raise suspicion. A set of traces corresponds to a
pattern relevant to the domain of work and
recognized by the expert practitioner (it is
more difficult for the novice). Anomaly
detection results from the construction of
meaning through a “mental model” that builds
on initial cues, guides further actions and
evolves as more indications become available.
A computer network and its activity are prime
examples of complex phenomena that cannot
be observed from a single (all-encompassing)
perspective without risking committing
oversimplifications (Smith, Branlat, Stephens
and Woods, 2008). Multiple perspectives are
necessary to provide the diverse conceptual
views (structural and functional properties,
representations) of the multifaceted work
situation. In other words, the different roles
engaged in cyber defense are not all interested
in and focused on the same aspects of the
situation. The different perspectives on the
network represent different ways to
characterize the information that a defending
team might need to acquire during an event to
conduct their operations. Since multiple
perspectives cannot simply add to one another
and can even conflict (e.g., physical location
vs. logical organization of network assets,
highly detailed view vs. global picture), it
becomes necessary to have multiple
representations for the various ways to
visualize and seek information in the network.
These representations can then constitute a
space that needs to be organized in order to
facilitate a meaningful navigation, i.e.,
coherent transitions between perspectives
(ibid.).
Supporting cyber defense: Multiple
perspectives are required to efficiently make
sense of situations at hand. These perspectives
are defined by the type of competences
required to understand particular aspects of the
situation (e.g., network or database activity),
but also by their more tactical or strategic
focus. Supporting cyber defense then means
supporting each of these perspectives as well
as their interaction.
Anticipation is the fundamental projective
dimension of human cognition, and a
characteristic of expert behavior. For
anticipation to be accurate, it requires a
sufficient understanding of the situation at
hand. As a form of feed-forward control,
anticipation is especially critical to keep up with
the pace of events by avoiding failures to
adapt in time, before perturbations cascade. In
the adversarial context of cyber security,
anticipation means understanding and making
projections regarding what the other side is
targeting. In the context of cyber defense, this
includes:
- vulnerabilities, especially described in
terms of targets and paths,
- patterns of attack, in order to foresee
particular elements of network activity (and
validate the current mental model of the
situation), and
- the attackers' intent, in order to identify
their plan and specific targets of interest.
Anticipation helps focus attention on the
elements associated with the vulnerability
path, especially for operators in charge of
monitoring network activity.
Supporting cyber defense: The notion of path,
whether a path actually taken by attackers or a
potential path based on their current access, as
well as the network's connectivity and
vulnerabilities, constitutes an interesting
leverage point to support anticipation in cyber
defense. In order to support anticipatory
processes, cyber defenders benefit from means
to navigate the network in order to identify
paths and capture those relevant to the
situations at hand.
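The notion of path can be illustrated with a small graph traversal: given a foothold and the known connectivity between assets, enumerate the hosts an attacker could reach and one path to each. The network graph below is a toy assumption, not data from the exercise.

```python
# Hedged sketch of path-based anticipation: breadth-first search over
# a (hypothetical) connectivity graph from an assumed attacker foothold.
from collections import deque

def reachable_paths(edges, foothold):
    """Return one shortest path per host reachable from `foothold`,
    i.e., the candidate attack paths a defender might watch."""
    paths = {foothold: [foothold]}
    queue = deque([foothold])
    while queue:
        host = queue.popleft()
        for nxt in edges.get(host, ()):
            if nxt not in paths:
                # record the first (shortest) path found to this host
                paths[nxt] = paths[host] + [nxt]
                queue.append(nxt)
    return paths

# Hypothetical network: DMZ web server through to a control device.
edges = {
    "dmz-web": ["app-server"],
    "app-server": ["db", "plc-gateway"],
    "plc-gateway": ["plc"],
}
paths = reachable_paths(edges, "dmz-web")
```

A defender maintaining such a graph can read off which assets lie on the vulnerability path from a compromised host, focusing monitoring attention accordingly.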
Cyber security is an adversarial domain, which
implies that participants are engaged in co-
adaptive processes based on their perception of
one another. Several aspects of the perceptual
processes are particularly noteworthy:
- Adaptations will be based on the inferences
made, whether they are accurate or not (see
Trent, Smith, Zelik, Grossman, and Woods,
2009 in intelligence analysis).
- Especially when there is uncertainty about
the other actors, their behavior will be
interpreted in the light of stereotypes
associated with the group(s) they seem to
belong to (a result long described in social
psychology through attribution theory).
- The understanding constructed is transient
and dynamic, and orients expectations and
information-seeking mechanisms.
The situation can be described from a defense
perspective as a control problem where a
primary goal is to avoid being outpaced by
events. If the attacking team is given ample
time without being hindered in their progress,
it will be able to conduct a variety of actions,
multiplying its opportunities to establish
connections or compromise assets. Falling
behind the curve therefore means that adverse
events can grow exponentially into a cascade
of disturbances that will spread thin defensive
efforts and resources. Observations support the
idea that the perception of the attack's
intention, because of its anticipatory nature, is
a central element of network defense. The
asymmetry described above between attack
and defense appears as a source of greater
challenges for the defensive team. That being
said, it might be due to the fact that defense was
primarily reactive during the exercise
observed. If a team adopts a more elaborate
adversarial defense strategy, it seems likely
that the attacking team would face equivalent
challenges that may hinder its progression.
Supporting cyber defense: The attacking team
is “fumbling about in the dark”, a
characteristic of their process that impairs their
pursuit of strategic goals. An adversarial
defensive strategy could consist of actively
probing sources of suspicious activity. By
giving the attacking team the impression that
they have been detected, network defense
would put them under pressure to act, thereby
forcing them to reveal more of their presence
(or sacrifice it entirely). From the perspective
of Signal Detection Theory, this corresponds to
making weak signals stand out more relative to
the 'noise' of the environment (separating the
Noise and Signal+Noise curves further), thereby
reducing uncertainty. Such a strategy takes
advantage of the understanding of typical
challenges of attack and toughens their trade-
offs. One difficulty is to ensure that such a
strategy does not compromise legitimate
network usage, including network defense
itself (e.g., creating confusion from ambiguous
traces for the network analysts).
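The Signal Detection Theory point above can be made concrete with a toy calculation: pressuring attackers into more detectable actions shifts the Signal+Noise distribution away from Noise, raising the detectability index d'. All distributions and numbers here are made up for illustration.

```python
# Toy illustration of the SDT argument: an adversarial defensive
# strategy that forces more detectable attacker actions increases the
# separation (d') between Noise and Signal+Noise. Numbers are invented.
from statistics import NormalDist

def d_prime(mu_noise, mu_signal, sigma):
    """Detectability index for equal-variance Gaussian Noise and
    Signal+Noise distributions."""
    return (mu_signal - mu_noise) / sigma

def hit_rate(mu_signal, sigma, criterion):
    """P(observation exceeds the decision criterion | signal present)."""
    return 1.0 - NormalDist(mu_signal, sigma).cdf(criterion)

# Passive defense: attacker actions barely stand out from normal traffic.
passive = d_prime(mu_noise=0.0, mu_signal=1.0, sigma=1.0)
# Adversarial defense: pressured attackers act more detectably.
adversarial = d_prime(mu_noise=0.0, mu_signal=2.5, sigma=1.0)
```

With a fixed decision criterion, the larger d' yields a higher hit rate without a higher false-alarm rate, which is the sense in which the strategy "reduces uncertainty."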
Impact of event: adapting in a complex
environment
An important aspect of cyber defense is what
exactly a cyber event (real or perceived as
such) means in terms of threats to the mission
and requirements for the response. The
complexity with which the defensive team
needs to cope is two-fold: it exists in the
network itself, i.e., in the operational
environment, but also in the response system,
i.e., within the defensive team and organization
it belongs to.
The general progression process of the
attacking team was described previously:
access gained to a machine is a source of new
opportunities to conduct intelligence or
disruptive actions and to obtain further access.
The numerous relationships between assets on
the network, therefore, risk creating the
conditions for a cascade of disturbances. On
the other hand, these relationships are defined
by network configuration and are part of the
knowledge the defensive team possesses and
maintains of the network. They correspond to
path dependencies that can be utilized by
defense to understand, anticipate and/or hinder
the attack's progress.
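As an illustration of how defense might use these path dependencies, the sketch below enumerates the assets an attacker could reach from a compromised machine with a breadth-first traversal. The asset names and graph structure are hypothetical, not drawn from the exercise:

```python
from collections import deque

# Hypothetical asset-dependency graph: edges point from an asset to the
# assets it grants access or trust to (shared credentials, open services...).
DEPENDENCIES = {
    "workstation-12": ["file-server", "print-server"],
    "file-server": ["backup-server", "domain-controller"],
    "domain-controller": ["mail-server", "workstation-12"],
}

def reachable_assets(compromised: str) -> set:
    """Assets an attacker could reach from `compromised`, following
    the known dependency relationships (breadth-first traversal)."""
    seen = {compromised}
    queue = deque([compromised])
    while queue:
        asset = queue.popleft()
        for neighbor in DEPENDENCIES.get(asset, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen - {compromised}

print(sorted(reachable_assets("workstation-12")))
# ['backup-server', 'domain-controller', 'file-server',
#  'mail-server', 'print-server']
```

The same traversal run over the defenders' maintained knowledge of the network could support both anticipation (which assets are now at risk?) and hindrance (which dependency, if cut, best contains the cascade?).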
A previous section describes how cyber
defense is distributed and how this creates
challenges for successfully coordinating
operations among the team. However, researchers
note that neither guidelines nor technology
appropriately supports the highly collaborative
nature of cyber defense (Werlinger et al.,
2010). Part of the response to these challenges
traditionally lies in the expertise of
practitioners and such capacity is expected
(often implicitly) by the organizations to which
they belong. The management of functional
interdependencies is a central issue identified
by the framework of Resilience Engineering.
One of the issues highlighted by the exercise
concerns the scale of response to events, where
'response' is understood broadly, spanning from
the detection phase to the actual response. The
central question is: what constitutes an
appropriate unit of adaptive behavior, i.e.,
what role(s) should be implicated in the
response when an event occurs? Team
members might be involved because:
- they need to participate in the response
  because of their particular role and form of
  competence: to understand the nature of the
  event (e.g., correlation) or to act upon it;
- functional interdependencies require that
  they participate in a coordinated response,
  e.g., to devise a common plan; or
- they simply need to be informed: the event
  is of interest for their role, or their tasks
  relate to the planned actions and will
  therefore be affected by the response.
Different types of events place different
demands on the scale of response; for each
event there is an appropriately matched scale.
The appropriateness of this match can
be discussed in relation to the maladaptive
patterns identified by Woods and Branlat
(2011).
- If the scale is too small, i.e., members of
  the team who should have been involved do
  not participate in the response:
  uncoordinated parts of the system risk
  working at cross-purposes, and larger
  phenomena might be missed, risking
  inappropriate adaptation or a failure to
  recognize the need to adapt (risk of stale
  adaptive processes).
- If the scale is too large, i.e., members
  participated in the response who did not
  need to be involved: resources are
  unnecessarily committed and the system is
  slowed down by higher costs of coordination
  (risk of falling behind the tempo if new
  disturbances arise).
Supporting cyber defense: The scale of
response is an important determinant of the
activity, and needs to be managed. It is an
indicator of a resilient or brittle adaptive process.
Systems built around sharing all information
with every single role commit a fallacy relative
to this issue: they assume that, since people
have (technical) access to information, they
will see it and recognize that they are
concerned by an event (even when it is
occurring outside of the regular boundaries of
their role). Rather than relying on the agents
directly experiencing the event, this approach
places the burden of managing scale on
external agents and risks putting them in a
situation of data overload. A more supportive
approach would consist of highlighting
interdependencies between roles, thus making
them more visible. This would help agents
understand the impact of their actions on
functionally related roles.
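One way to picture such support is an explicit, machine-readable map of which roles a given role's actions affect, from which the scope of a response can be derived. The role names and dependency map below are hypothetical, invented purely for illustration:

```python
# Hypothetical map of functional interdependencies: each role lists
# the roles whose tasks are affected when it acts on an event.
INTERDEPENDENCIES = {
    "host-analyst": ["network-analyst", "supervisor"],
    "network-analyst": ["firewall-admin", "supervisor"],
    "firewall-admin": ["supervisor"],
    "supervisor": [],
}

def response_scope(participants):
    """Given the roles actively responding to an event, derive the
    roles that, at minimum, need to be informed of the response."""
    informed = set()
    for role in participants:
        informed.update(INTERDEPENDENCIES.get(role, []))
    return {
        "participate": sorted(participants),
        "inform": sorted(informed - set(participants)),
    }

scope = response_scope({"host-analyst", "network-analyst"})
# participate: ['host-analyst', 'network-analyst']
# inform:      ['firewall-admin', 'supervisor']
```

A tool built on such a map would notify only the functionally related roles, rather than broadcasting everything to everyone and hoping each agent recognizes their own involvement.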
Throughout the exercise, the Blue team tried to
implement changes they thought were needed
in order to respond to events they perceived.
Due to the design of the exercise, they needed
to produce 'request for change' forms that
were transmitted, along with the evidence they
had gathered, to a White cell, which
represented the network owners. Almost
systematically, the requested changes were not
implemented, with rejections justified by a lack of
sufficient evidence. In addition, such responses
typically arrived after significant amounts of
time had passed. During the process, mid-level
management in the Blue team was busy
transmitting requests and responses; when this
became their primary task, they quickly
experienced a workload bottleneck. This
situation led them to abandon their roles as
supervisors, who were supposed to keep track
of the team's progress and difficulties. The
bottleneck illustrates issues associated with
purely hierarchical control structures:
operators more directly in contact with the
controlled process lack authority and
autonomy, and the required transmission of
information between layers of the system is
inefficient and is a source of bottlenecks.
Because of the limited window of opportunity
the defensive team has in order to act upon
detected adverse events, it is important that the
decision making process occur without delay.
As in other domains related to emergency
response, cyber defense would benefit from
implementing polycentric control
architectures (Ostrom, 1999; Woods and
Branlat, 2010). This research emphasizes that
lower echelons, with more specific competences
and more direct contact with the controlled
process than remote managers have, develop
much finer knowledge of the process's
behavior. This knowledge allows them to
detect anomalies early, thereby making them
more able to adjust their actions to meet
security or safety goals. That being said, both
purely centralized and decentralized
approaches are likely to fail (Andersson and
Ostrom, 2008); they simply fall victim to
different forms of adaptive challenges. In the
domain of cyber security, in particular,
systematically reducing identified
vulnerabilities is not a viable strategy, since it
threatens other, production-related goals. Such
decisions therefore need to result from
negotiations that confront security and
production goals.
Discussions with cyber security experts
revealed how the management of the security
vs. production trade-off can be complicated by
factors that are outside of the sole context of
the event, and even counterintuitive. In some
situations, security goals are purposefully
abandoned to meet larger objectives. For
instance, from the perspective of the CERT
(Computer Emergency Readiness Team),
situations exist where organizations maintain
open access in spite of their knowledge of on-
going attacks. When attacks are unusual,
sacrifices are made to allow more elaborate
forensics and investigations to be conducted
(e.g., by FBI or others) in order to learn from
the events (e.g., about the attackers or about an
innovative strategy). The knowledge produced
serves the longer-term security goals of a
larger community rather than an effective
response to the unique events experienced.
More commonly, the idea of systematically
patching systems in the face of threats is far
from obvious or convenient in actual
production settings. Organizations typically
use custom-designed applications to fulfill
their particular needs. And often, these tailored
applications have been developed on specific
platforms at a given time and have hardly
evolved since. Patching or updating the
underlying platforms would risk preventing the
applications from working correctly. For
service purposes, organizations knowingly
accept vulnerabilities associated with older
platforms for which fixes exist. Since the time
to actually address these vulnerabilities is
typically long (months or years are required to
develop new versions of the custom
applications), the cost of disruptions due to
improper security is perceived as smaller than
the cost associated with the disruption of
services.
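The trade-off just described can be framed as a simple expected-cost comparison. The probabilities and costs below are purely illustrative, invented to make the arithmetic concrete:

```python
def expected_cost(prob_disruption: float, cost_disruption: float) -> float:
    """Expected cost of a course of action: probability times impact."""
    return prob_disruption * cost_disruption

# Leaving the old platform unpatched: small chance of a security incident.
cost_unpatched = expected_cost(prob_disruption=0.05, cost_disruption=500_000)

# Patching now: near-certain breakage of the custom application built on it.
cost_patched = expected_cost(prob_disruption=0.95, cost_disruption=200_000)

# Under these invented numbers, accepting the known vulnerability is the
# cheaper option -- the rationale organizations in the text act on.
assert cost_unpatched < cost_patched
```

The point is not the specific numbers but the structure of the decision: as long as service disruption is near-certain and costly while exploitation is merely possible, knowingly accepting the vulnerability can be the locally rational choice.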
Supporting cyber defense: Cyber defense
requires the implementation of a polycentric
decision architecture. Such an architecture
empowers layers of the system in direct
contact with the controlled process while more
distant layers are in charge of both monitoring
the evolution of the situation and the
coordination of operations. In particular, the
management of trade-offs between security
and production goals needs to be the result of
negotiation between these perspectives.
Because of the complex nature of these trade-
offs, negotiation processes need to be more
direct and better supported, not simply through
exchanges of information along the
management line.
CONCLUSION
Domains involving adversarial dynamics can
be essentially competitive (e.g., military
operations, games like chess, or cyber security)
or occur in mixed cooperative-competitive
environments (e.g., driving, board or card
games). In all cases, activities exist in the
context of their interplay, and the interplay
cannot be understood by focusing on its
isolated parts (e.g., solely considering the
'anomaly response' side of cyber defense). The
interplay corresponds to continuous processes
of co-adaptation transforming the system in
ways that create both challenges and
opportunities for the adversary. The study
described here is an exploration of the domain
of cyber security from a human-centered
perspective. Research studying cyber security
as a whole is lacking. In spite of its limitations
(see Branlat, 2011), the study emphasizes core
characteristics of the domain that are under-
represented in the literature. Cyber security is
adversarial, highly collaborative, and occurs in
an operational environment where it is not the
first priority, but a highly desired feature of a
larger system pursuing production goals. These
core characteristics are especially important on
the defensive side. Consideration of these
dimensions is needed in order to develop
further knowledge of the domain, and design
and conduct future studies. Overlooking or
oversimplifying them risks undermining
results obtained.
REFERENCES
Andersson, K., & Ostrom, E. (2008). Analyzing
decentralized resource regimes from a
polycentric perspective. Policy Sciences, 41(1),
71-93.
Branlat, M. (2011). Challenges to Adversarial
Interplay Under High Uncertainty: Staged-
World Study of a Cyber Security Event (PhD
Dissertation). Ohio State University, Columbus,
OH.
Branlat, M., Morison, A. M., Finco, G. J., Gertman,
D. I., Le Blanc, K., & Woods, D. D. (2011). A
study of adversarial interplay in a cybersecurity
event. In S. M. Fiore & M. Harper-Sciarini
(Eds.), Proceedings of the 10th International
Conference on Naturalistic Decision Making
(NDM 2011). May 31st to June 3rd, 2011,
Orlando, FL. Orlando, FL: University of Central
Florida.
D'Amico, A., & Whitley, K. (2008). The Real
Work of Computer Network Defense Analysts.
In VizSEC 2007: Proceedings of the Workshop
on Visualization for Computer Security.
Springer-Verlag, Sacramento, CA.
Dekker, S. (2003). Failure to adapt or adaptations
that fail: contrasting models on procedures and
safety. Applied Ergonomics, 34(3), 233-238.
Finco, G., Lee, K., Miller, G., Tebbe, J., & Wells,
R. (2007). Cyber Security Procurement
Language for Control Systems Version 1.6. INL
Critical Infrastructure Protection/Resilience
Center, Idaho Falls, USA.
Goodall, J. R., Lutters, W. G., & Komlodi, A.
(2004). I know my network. In Proceedings of
the 2004 ACM conference on Computer
supported cooperative work - CSCW '04 (p.
342). Presented at the 2004 ACM conference,
Chicago, Illinois, USA.
Hawkey, K., Muldner, K., & Beznosov, K. (2008).
Searching for the Right Fit: Balancing IT
Security Management Model Trade-Offs. IEEE
Internet Computing, 12(3), 22-30.
Hollnagel, E., Woods, D. D., & Leveson, N. (Eds.).
(2006). Resilience Engineering: Concepts and
Precepts. Aldershot, UK: Ashgate.
Jonsson, E., & Olovsson, T. (1997). A Quantitative
Model of the Security Intrusion Process Based
on Attacker Behavior. IEEE Transactions on
Software Engineering, 23, 235–245.
Klein, G., Moon, B., & Hoffman, R. R. (2006).
Making Sense of Sensemaking 2: A
Macrocognitive Model. Intelligent Systems,
IEEE, 21(5), 88-92.
Klein, G., Pliske, R., Crandall, B., & Woods, D. D.
(2005). Problem detection. Cognition,
Technology & Work, 7(1), 14-28.
Ostrom, E. (1999). Coping with Tragedies of the
Commons. Annual Reviews in Political Science,
2(1), 493-535.
Smith, M. W., Branlat, M., Stephens, R. J., &
Woods, D. D. (2008). Collaboration Support Via
Analysis of Factions. NATO RTO HFM-142
Symposium on Adaptability in Coalition
Teamwork, Copenhagen, Denmark, 21-23 April
2008.
Trent, S. A., Smith, M. W., Zelik, D., Grossman, J.,
& Woods, D. D. (2009). Reading Intent and
Other Cognitive Challenges in Intelligence
Analysis. In R. McDermott & L. Allender (Eds.),
Advanced Decision Architectures for the
Warfighter: Foundations and Technology (pp.
307-321). Partners of the Army Research
Laboratory Advanced Decision Architectures
Collaborative Technology Alliance.
Werlinger, R., Muldner, K., Hawkey, K., &
Beznosov, K. (2010). Preparation, detection, and
analysis: the diagnostic work of IT security
incident response. Information Management &
Computer Security, 18(1), 26-42.
Woods, D. D. (1993). Process-tracing methods for
the study of cognition outside of the
experimental psychology laboratory. In G. A.
Klein, J. Orasanu, R. Calderwood, & C. E.
Zsambok (Eds.), Decision making in action:
Models and methods (pp. 228-251). Norwood,
N.J.: Ablex Publishing Corporation.
Woods, D. D. (2009). Escaping failures of
foresight. Safety Science, 47(4), 498-501.
Woods, D. D., & Branlat, M. (2010). Hollnagel's
test: being 'in control' of highly interdependent
multi-layered networked systems. Cognition,
Technology & Work, 12(2), 95-101.
Woods, D. D., & Branlat, M. (2011). Basic Patterns
in How Adaptive Systems Fail. In E. Hollnagel,
J. Pariès, D. D. Woods, & J. Wreathall (Eds.),
Resilience Engineering in Practice (pp. 127-
144). Farnham, UK: Ashgate.
Woods, D. D., & Hollnagel, E. (2006). Joint
Cognitive Systems: Patterns in Cognitive
Systems Engineering. Boca Raton, FL: Taylor &
Francis/CRC Press.
Matthieu Branlat is a Research Assistant at
the Ohio State University, in the Cognitive
Systems Engineering Lab. Through the study
of socio-technical work environments, his
research interests include resilience
engineering, system safety, decision making
and collaborative work. Recent projects are
conducted in domains such as cyber security
and intelligence analysis, urban firefighting
and disaster management, medical care and
patient safety.
Alexander Morison is a Research Scientist in
the Integrated Systems Engineering
Department at the Ohio State University
studying the growing challenge of coupling
human observers to remote sensor systems.
Inspired by models of human perception and
attention, he has invented solutions to the
image overload, keyhole effect, and multiple
feeds problems associated with layered sensing
systems and mobile sensor platforms.
David Woods is a professor at the Ohio State
University, and the co-director of the
Cognitive Systems Engineering Lab. From his
initial work following the Three Mile Island
accident in nuclear power, to studies of
coordination breakdowns between people and
automation in aviation accidents, to his role in
founding and developing the Resilience
Engineering field, he has studied how human
and team cognition contributes to success and
failure in complex, high risk systems.