
RADAMS: Resilient and Adaptive Alert and Attention Management Strategy against Informational Denial-of-Service (IDoS) Attacks

Linan Huang, Student Member, IEEE, and Quanyan Zhu, Senior Member, IEEE

Abstract—Attacks exploiting human attentional vulnerability have posed severe threats to cybersecurity. In this work, we identify and formally define a new type of proactive attentional attacks called Informational Denial-of-Service (IDoS) attacks that generate a large volume of feint attacks to overload human operators and hide real attacks among feints. We incorporate human factors (e.g., levels of expertise, stress, and efficiency) and empirical results (e.g., the Yerkes–Dodson law and the sunk cost fallacy) to model the operators’ attention dynamics and their decision-making processes along with the real-time alert monitoring and inspection. To assist human operators in timely and accurately dismissing the feints and escalating the real attacks, we develop a Resilient and Adaptive Data-driven alert and Attention Management Strategy (RADAMS) that de-emphasizes alerts selectively based on the alerts’ observable features. RADAMS uses reinforcement learning to achieve a customized and transferable design for various human operators and evolving IDoS attacks. The integrated modeling and theoretical analysis lead to the Product Principle of Attention (PPoA), fundamental limits, and the tradeoff among crucial human and economic factors. Experimental results corroborate that the proposed strategy outperforms the default strategy and can reduce the IDoS risk by as much as 20%. Besides, the strategy is resilient to large variations of costs, attack frequencies, and human attention capacities. We have recognized interesting phenomena such as attentional risk equivalency, the attacker’s dilemma, and the half-truth optimal attack strategy.

Index Terms—Human attention vulnerability, feint attacks, reinforcement learning, risk analysis, cognitive load, alert fatigue.

I. INTRODUCTION

HUMAN vulnerability and human-induced security threats have been a long-standing and fast-growing problem for the security of Industrial Control Systems (ICSs). According to Verizon [2], 85% of data breaches involve human error. Attentional vulnerability is one of the representative human vulnerabilities. Adversaries have exploited human inattention to launch social engineering and phishing attacks toward employees and users. According to the report [3], 29% of employees fall for a phishing scam and 36% of employees send a misdirected email owing to lack of attention. These attentional attacks are reactive, as they exploit the existing human attention patterns. On the contrary, proactive attentional attacks can strategically change the attention pattern of a

L. Huang and Q. Zhu are with the Department of Electrical and Computer Engineering, New York University, Brooklyn, NY, 11201, USA. E-mail: {lh2328,qz494}@nyu.edu

A preliminary version of this work [1] was presented at the 12th Conference on Decision and Game Theory for Security.

human operator or a network administrator. For example, an attacker can launch feint attacks to trigger a large volume of alerts and overload the human operators so that operators fail to inspect the alert associated with real attacks [4]. We refer to this new type of attacks as the Informational Denial-of-Service (IDoS) attacks, which aim to deplete the limited attention resources of human operators to prevent them from accurate detection and timely defense.

IDoS attacks bring significant security challenges to ICSs for the following reasons. First, alert fatigue has already been a serious problem in the age of infobesity with terabytes of unprocessed data or manipulated information. IDoS attacks exacerbate the problem by generating feints to intentionally increase the percentage of false-positive alerts. Second, IDoS attacks directly target the human operators and security analysts in the Security Operations Center (SOC). Third, as ICSs become increasingly complicated and time-critical, the human operators require higher expertise levels to understand the domain information and detect feints. Finally, since human operators behave differently and IDoS attacks are a broad class of adaptive attacks, it is challenging (yet highly desirable) to develop a customized and resilient defense. There is an apparent need to understand this class of proactive attentional attacks, quantify its consequences and risks, and develop associated mitigation strategies.

To this end, we establish a holistic model of IDoS attacks, alert generation, and the human operators’ monitoring, inspection, and decision-making of alerts. In the IDoS attack model, we adopt a Markov renewal process to characterize the sequential arrival of feints and real attacks that target different ICS components. We use a revelation probability to abstract the alert generation and triage process of existing detection systems. The revelation probability maps the attacks’ hidden types and targets stochastically to the associated alerts’ observable category labels. To model the human operators’ attention dynamics and alert responses under IDoS attacks, we directly incorporate the operators’ levels of expertise, stress, and efficiency into the security design based on existing results from the psychology literature, including the Yerkes–Dodson law [5] and the sunk cost fallacy [6]. To assist human operators in alert inspection and response, compensate for their attentional vulnerabilities, and combat IDoS attacks, we develop human-centered technologies that selectively make some alerts less noticeable based on their category labels. Reinforcement learning is applied to make the human-assistive security technology resilient, automatic,

arXiv:2111.03463v1 [cs.CR] 1 Nov 2021


Fig. 1: The overview diagram of RADAMS against IDoS in ICS, which incorporates the IDoS attack model, human attention model, and the human-assistive security technology in the red, green, and blue boxes, respectively. The processes in black are not the focus of this work. The modern SOC adopts a hierarchical alert analysis process. The tier-1 SOC analysts, also referred to as the operators, are in charge of real-time alert monitoring and inspections. The tier-2 SOC analysts are in charge of the in-depth analysis.

and adaptive to various human models and attack scenarios.

We illustrate the overview diagram of the Resilient and Adaptive Alert and Attention Management Strategy (RADAMS) in Fig. 1. RADAMS enriches the existing alert selection frameworks with the IDoS attack model, the human attention model, and the human-assistive security technology highlighted in red, green, and blue, respectively. Through the integrated modeling and theoretical analysis, we obtain the Product Principle of Attention (PPoA), which states that the Attentional Deficiency Level (ADL), i.e., the probability of incomplete alert responses, and the risk of IDoS attacks depend on the product of the supply and the demand of human attention resources. The closed-form expressions under mild assumptions lead to several fundamental limits, including the minimum ADL and the maximum length of de-emphasized alerts to reduce the IDoS risk. We explicitly characterize the tradeoff among crucial factors such as the ADL, the reward of alert attention, and the impact of alert inattention.

Finally, we propose an algorithm to learn the adaptive AM strategy based on the operator’s alert inspection outcomes. We present several case studies based on the simulation of different IDoS attacks and alert inspection processes. The numerical results show that the proposed optimal AM strategy outperforms the default strategy and can effectively reduce the IDoS risk by as much as 20%. The strategy is also resilient to a large range of cost variations, attack frequencies, and human attention capacities. We have observed the phenomenon of attentional risk equivalency, which states that deviating from the optimal to sub-optimal strategies for some category labels can reduce the risk under the default strategy to approximately the same level. The results also corroborate that RADAMS can adapt to different category labels to strike a balance between quantity (i.e., inspect more alerts) and quality (i.e., complete alert responses to dismiss feints and escalate real attacks). Finally, we identify the attacker’s dilemma, where destructive IDoS attacks induce unbearable costs to the attacker. We also identify the half-truth attack strategy as the optimal IDoS attack strategy when feints are generated at a high cost.

A. Organization of the Paper

The rest of the paper is organized as follows. The related works are presented in Section II. Sections III, IV, and V introduce the IDoS attack model, the human operator model, and the human-assistive security technology, respectively. We analyze the attentional deficiency level and the risk of IDoS attacks in closed form for the class of ambitious operators in Section VI. Section VII presents a case study of alert inspection under IDoS attacks and adaptive AM strategies. Section VIII concludes the paper.

II. RELATED WORKS

A. Alert Management

Previous works have applied alert management methods during the alert generation, detection, and response processes to mitigate alert fatigue and enhance cybersecurity.

1) Source Management: On the one hand, proactive defense [7] and deception techniques, including honeypots [8], [9] and moving target defense [10], have managed to reduce successful attacks and the resulting alerts from the source.


On the other hand, previous works have designed incentive mechanisms (e.g., [11], [12]) and information mechanisms (e.g., [13]) to enhance insiders’ compliance, reduce users’ misbehavior, and reduce false positives.

2) Detection Management: A rich literature has attempted to develop detection systems capable of reducing false positives while maintaining the ability to detect malicious behaviors. Methods include statistical analysis [14], fuzzy inference [15], and machine learning approaches [16]. Alert aggregation and correlation methods [17] have also been applied to dismiss repeated and innocuous alerts and generate alerts of system-level threat information.

3) Response Management: Despite the significant advances of the alert reduction methods introduced in Sections II-A1 and II-A2, the demand for alert inspection still exceeds the operators’ capacity. To this end, researchers have developed alert selection and prioritization approaches through simulation-based optimization [18], fuzzy logic [19], deep learning [20], and reinforcement learning [21]. Compared to the works that rank the alerts based on their contextual information, our human-centered approach explicitly models the attentional behaviors of human operators and selects alerts based on human cognitive capacity.

B. Feint Attacks and Human Attentional Models

There is a rich literature on attacks against detection systems [22]. In particular, the authors in [23], [24] have developed tools that can generate false positives by matching detection signatures. The tools are tested on SNORT [25], and the empirical results verify the feasibility of feint attacks on detection systems. Different from the practical feint attacks that focus on the vulnerability of detection systems, we focus on the attentional vulnerabilities and the impact of feints on human operators. Moreover, we abstract models to formally characterize general feint attacks, enabling us to quantify the risk and develop human-assistive security technologies.

We can classify human vulnerabilities into acquired vulnerabilities (e.g., lack of security awareness and noncompliance) and innate ones (e.g., bounded attention and rationality) based on whether they can be mitigated through short-term training and security rules. Many works (e.g., [11], [13], [26]) have emphasized the urgency and necessity of reducing acquired human vulnerability and proposed human-assistive strategies. However, few works have focused on mitigation strategies for innate vulnerabilities. Visual support systems have been used for rapid cyber event triage [27] and alert investigations [28], and eye-tracking data have been incorporated to enhance attention for phishing identification [29]. The authors in [30] perform an anthropological study in a corporate SOC to model and mitigate security analyst burnout. These works lay the foundations of empirical solutions to mitigate human attentional vulnerabilities. Our work combines real-time human behavioral and decision data with well-identified human factors to enable quantitative characterization of empirical relationships such as the Yerkes–Dodson law [5]. The learning-based method for attention handling also makes our human-assistive technology adaptive and transferable to various human-technical systems.

III. IDOS ATTACKS AND SEQUENTIAL ALERT ARRIVALS

As illustrated in the first column of Fig. 1, after the IDoS attacker has generated feint and real attacks, the detection system monitors the readings from physical layers and log files from cyber layers and generates alerts according to the generation rules. Then, the alerts are sent to the SOC, and a triage system automatically generates their category labels (e.g., the alerts’ criticality) based on the mapping rules. The rules for alert generation and triage mapping are pre-defined, and their designs are not the focus of this work.

A. Feint and Real Attacks of Heterogeneous Targets

After the initial intrusion, privilege escalation, and lateral movement, IDoS attackers can launch feint and real attacks sequentially, as illustrated by the solid red arrows in Fig. 2. With the deliberate goal of triggering alerts, feint attacks require fewer resources to craft. Although feints have limited impacts on the target system, they aggravate alert fatigue by depleting human attention resources and preventing human operators from a timely response to real attacks. For example, the attacker can intentionally attempt to access a database with wrong credentials and, in the meantime, gradually change the temperature of the reactor of a nuclear power plant. The repeated log-in attempts trigger an excessive number of alerts so that the overloaded human operators fail to pay sustained attention and respond timely to the sensor alerts of the temperature deviation.

We denote feint and real attacks as θ_FE and θ_RE, respectively, where Θ := {θ_FE, θ_RE} is the set of attack types. Each feint or real attack can target cyber components (e.g., servers, databases, and workstations) or physical components (e.g., sensors of pressure, temperature, and flow rate) in the ICS. We define Φ as the set of potential attack targets. The stochastic arrival of these attacks is modeled as a Markov renewal process where t_k, k ∈ Z_{0+}, is the time of the k-th arrival. We refer to the k-th attack equivalently as the attack at attack stage k ∈ Z_{0+} and let θ^k ∈ Θ and φ^k ∈ Φ be the attack’s type and target at attack stage k, respectively. Define κ_AT ∈ K_AT : Θ × Φ × Θ × Φ → [0,1] as the transition kernel, where κ_AT(θ^{k+1}, φ^{k+1} | θ^k, φ^k) denotes the probability that the (k+1)-th attack has type θ^{k+1} ∈ Θ and target φ^{k+1} ∈ Φ when the k-th attack has type θ^k ∈ Θ and target φ^k ∈ Φ. The inter-arrival time τ_k := t_{k+1} − t_k is a continuous random variable with support [0, ∞) and Probability Density Function (PDF) z ∈ Z : Θ × Φ × Θ × Φ → R_{0+}, where z(t | θ^{k+1}, φ^{k+1}, θ^k, φ^k) is the probability density that the inter-arrival time is t when the attacks’ types and targets at attack stages k and k+1 are (θ^k, φ^k) and (θ^{k+1}, φ^{k+1}), respectively. The values of κ_AT ∈ K_AT and z ∈ Z are unknown to the human operators and the designer of RADAMS. Attackers can adapt κ_AT and z to different ICSs and alert inspection schemes to achieve their attack goals. We formally define IDoS attacks in Definition 1.

Definition 1 (IDoS Attacks). An IDoS attack is a sequence of feint and real attacks of heterogeneous targets, which can be characterized by the 4-tuple (Θ, Φ, K_AT, Z).
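As a concrete illustration, the Markov renewal arrival process of Definition 1 can be sketched in a few lines. The type and target sets, the uniform transition kernel, and the exponential inter-arrival distribution below are illustrative assumptions for this sketch, not parameters taken from the paper.

```python
import random

TYPES = ["FE", "RE"]              # feint / real, i.e., theta_FE and theta_RE
TARGETS = ["sensor", "database"]  # example target set Phi

STATES = [(th, ph) for th in TYPES for ph in TARGETS]

# kappa_AT[(theta_k, phi_k)] -> distribution over (theta_{k+1}, phi_{k+1});
# uniform here purely for illustration.
kappa_AT = {s: {t: 1.0 / len(STATES) for t in STATES} for s in STATES}

def simulate(horizon, seed=0):
    """Sample attack stages (t_k, theta_k, phi_k) until the horizon."""
    rng = random.Random(seed)
    t, state = 0.0, rng.choice(STATES)
    stages = []
    while t <= horizon:
        stages.append((t, *state))
        nxt = rng.choices(list(kappa_AT[state]),
                          weights=kappa_AT[state].values())[0]
        # Stand-in for the inter-arrival PDF z(t | ...); the paper leaves
        # z general, an exponential is assumed here.
        t += rng.expovariate(1.0)
        state = nxt
    return stages
```

A real attacker would choose κ_AT and z adversarially; this sketch only shows how the (Θ, Φ, K_AT, Z) tuple generates a sample path of attack stages.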


Fig. 2: The timelines of an IDoS attack, alerts under AM strategies, and manual inspections are depicted in red, blue, and green, respectively. The inspection stage h ∈ Z_{0+} is equivalent to the attack stage I_h ∈ Z_{0+}. The red arrows represent the sequential arrivals of feints and real attacks. The semi-transparent blue arrows and the dashed green arrows represent the de-emphasized alerts and the alerts without inspections, respectively.

B. Alert Triage Process and System-Level Metrics

The alerts triggered by IDoS attacks contain device-level contextual information, including the software version, hardware parameters, existing vulnerabilities, and security patches. The alert triage process consists of rules that map the device-level information to system-level metrics, which helps human operators make timely responses. Some essential metrics are listed as follows.
• Source s_SO ∈ S_SO: The ICS sensors or the cyber components that the alerts are associated with.
• Time Sensitivity s_TS ∈ S_TS: The length of time that the potential attack needs to achieve its attack goals.
• Complexity s_CO ∈ S_CO: The degree of effort that a human operator takes to inspect the alert.
• Susceptibility s_SU ∈ S_SU: The likelihood that the attack succeeds and inflicts damage on the protected system.
• Criticality s_CR ∈ S_CR: The consequence or the impact of the attack’s damage.
These alert metrics are observable to the human operator and the RADAMS designer, and they form the category label of an alert. We define the category label associated with the k-th alert as s^k := (s^k_SO, s^k_TS, s^k_CO, s^k_SU, s^k_CR) ∈ S, where S := S_SO × S_TS × S_CO × S_SU × S_CR. The joint set S can be adapted to suit the organization’s security practice. For example, we have S_TS = ∅ if time sensitivity is unavailable or unimportant.

The alert triage process establishes a stochastic connection between the hidden types and targets of IDoS attacks and the observable category labels of the associated alerts. Let o(s^k | θ^k, φ^k) be the probability of obtaining category label s^k ∈ S when the associated attack has type θ^k ∈ Θ and target φ^k ∈ Φ. The revelation kernel o reflects the quality of the alert triage. For example, feints with lightweight resource consumption usually have a limited impact. Thus, a high-quality triage process should classify the associated alert as low criticality with a high probability. Letting b(θ^k, φ^k) denote the probability that the k-th attack has type θ^k and target φ^k at the steady state, we can compute the steady-state distribution b in closed form based on κ_AT. Then, the transition of category labels at different attack stages is also Markov and is represented by κ_CL ∈ K_CL : S × S → [0,1]. We can compute

κ_CL(s^{k+1} | s^k) = Pr(s^{k+1}, s^k) / Σ_{s^{k+1} ∈ S} Pr(s^{k+1}, s^k)

based on κ_AT, o, and b, where Pr(s^{k+1}, s^k) = Σ_{θ^k, θ^{k+1} ∈ Θ} Σ_{φ^k, φ^{k+1} ∈ Φ} κ_AT(θ^{k+1}, φ^{k+1} | θ^k, φ^k) o(s^k | θ^k, φ^k) o(s^{k+1} | θ^{k+1}, φ^{k+1}) b(θ^k, φ^k). In this work, we focus on the case where the detection system introduces the same delay between attacks and their triggered alerts. Since the sequences of attacks and alerts have a one-to-one mapping, we can consider zero delay time without loss of generality. Hence, the sequence of alerts associated with an IDoS attack (Θ, Φ, K_AT, Z) is also a Markov renewal process characterized by the 3-tuple (S, K_CL, Z).
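The computation of κ_CL from κ_AT, o, and b can be written out directly. The toy two-state, two-label example at the end is an assumption for illustration; `x` abbreviates a hidden (type, target) pair.

```python
def label_kernel(kappa_AT, o, b, labels, states):
    """Compute kappa_CL(s' | s) from the attack kernel kappa_AT, the
    revelation kernel o, and the steady-state distribution b, following
    Pr(s', s) = sum over x, x' of kappa_AT(x'|x) o(s|x) o(s'|x') b(x)."""
    joint = {(s2, s1): sum(kappa_AT[x][x2] * o[x][s1] * o[x2][s2] * b[x]
                           for x in states for x2 in states)
             for s1 in labels for s2 in labels}
    kappa_CL = {}
    for s1 in labels:
        z = sum(joint[(s2, s1)] for s2 in labels)  # normalize over s'
        kappa_CL[s1] = {s2: joint[(s2, s1)] / z for s2 in labels}
    return kappa_CL

# Toy example: two hidden states, two labels, uniform attack dynamics,
# and a revelation kernel that is identical across states.
states = [("FE", "s1"), ("RE", "s1")]
labels = ["low", "high"]
kappa_AT = {x: {x2: 0.5 for x2 in states} for x in states}
o = {x: {"low": 0.7, "high": 0.3} for x in states}
b = {x: 0.5 for x in states}
kappa_CL = label_kernel(kappa_AT, o, b, labels, states)
```

Because o does not depend on the hidden state in this toy instance, the induced label chain degenerates to i.i.d. draws from o; with informative revelation kernels, κ_CL inherits genuine Markov structure from κ_AT.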

IV. HUMAN ATTENTION MODEL UNDER IDOS ATTACKS

An SOC typically adopts a hierarchical alert analysis [31]. The attention model in this section applies to the tier-1 SOC analysts, or the operators, who are in charge of monitoring, inspecting, and responding to alerts in real time. As illustrated by the green box in Fig. 1, the operators choose to inspect certain alerts, dismiss the feints, and escalate the real attacks to tier-2 SOC analysts for in-depth analysis. The in-depth analysis can last hours to months, during which the tier-2 analysts correlate incidents from different components in the ICS over long periods to build threat intelligence and analyze the impact. The threat intelligence is then incorporated to form and update the generation rules of the detection system and the mapping rules of the triage process.

A. Alert Responses

Due to the high volume of alerts and the potential short-term surge arrivals, human operators cannot inspect all alerts in real time. The uninspected alerts receive an alert response w_NI. Whether the operator chooses to inspect an alert depends on the switching probability in Section IV-B.

When the operator inspects an alert, he can be distracted by the arrival of new alerts and switch to newly-arrived alerts without completing the current inspection. We elaborate on the attention dynamics in Section IV-C. The alert with an incomplete inspection is labeled by w_UN. Besides the insufficient inspection time, the operator’s cognitive capacity constraint can also prevent him from determining whether the alert is triggered by a feint or a real attack. In this work, we consider prudent operators. When they cannot determine the attack’s type after a full inspection, the associated alert is labeled as w_UN. We elaborate on how the insufficient inspection time and the operator’s cognitive capacity constraint lead to w_UN, referred to as the inadequate alert response, in Section IV-D. The alerts labeled as w_NI and w_UN are ranked and queued up for delayed inspections at later stages.

When the operator successfully completes the alert inspection with a deterministic decision, he either dismisses the alert (denoted by w_FE) or escalates the alert to tier-2 SOC analysts for in-depth analysis (denoted by w_RE), as shown in Fig. 1. We use w^k ∈ W := {w_FE, w_RE, w_UN, w_NI} to denote the operator’s response to the alert at attack stage k ∈ Z_{0+}. We can extend the set W to suit the organization’s security practice. For example, some organizations let the operators report their estimations and confidence levels concerning incomplete alert inspections, i.e., divide the label w_UN into finer subcategories. At later stages, the delayed inspection prioritizes the alerts estimated to be associated with real attacks with high confidence levels.
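The response set W and its split into complete and delayed responses could be encoded as follows. The enum member names mirror the paper's subscripts; the descriptive string values and the two derived sets are additions for this sketch.

```python
from enum import Enum

class Response(Enum):
    """Operator alert responses, mirroring W = {w_FE, w_RE, w_UN, w_NI}.
    String values are descriptive labels added for this sketch."""
    FE = "dismiss (feint)"
    RE = "escalate (real attack)"
    UN = "incomplete inspection"
    NI = "not inspected"

COMPLETE = {Response.FE, Response.RE}  # deterministic decisions
DELAYED = {Response.UN, Response.NI}   # ranked and queued for later
```

Splitting w_UN into finer subcategories, as some organizations do, would amount to replacing the single `UN` member with several confidence-graded members.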

B. Probabilistic Switches within Allowable Delay

Alerts are monitored in real time when they arrive. When the category label of the new alert indicates higher time sensitivity, susceptibility, or criticality, the operator can delay the current inspection (i.e., label the alert under inspection as w_UN) and switch to inspect the new alert. We denote κ^{∆k}_SW(s^{k+∆k} | s^k) as the operator’s switching probability when the previous alert at attack stage k and the new alert at stage k+∆k, ∆k ∈ Z_+, have category labels s^k ∈ S and s^{k+∆k} ∈ S, respectively. As a probability measure,

Σ_{∆k=1}^{∞} Σ_{s^{k+∆k} ∈ S} κ^{∆k}_SW(s^{k+∆k} | s^k) ≡ 1, ∀k ∈ Z_{0+}, ∀s^k ∈ S. (1)

Since the operator cannot observe the attack’s hidden type and hidden target, the switching probability κ^{∆k}_SW is independent of (θ^k, φ^k) and (θ^{k+1}, φ^{k+1}). The switching probability depends on the time that the operator has already spent on the current inspection. For example, an operator becomes less likely to switch after spending a long time inspecting an alert of low criticality or beyond his capacity, which can lead to the Sunk Cost Fallacy (SCF).
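A switching kernel satisfying the normalization in Eq. (1) can be sketched as follows. Truncating the delay support at D_MAX and using a uniform distribution are assumptions for this toy example; Eq. (1) only requires that the total mass over all delays and new labels equals one.

```python
LABELS = ["low", "high"]
D_MAX = 3  # truncation of the delay support, assumed for the sketch

def make_switch_kernel():
    """Return kappa[dk][s_new][s_old], uniform over the finite support
    so that, for each current label s_old, the masses sum to one."""
    mass = 1.0 / (D_MAX * len(LABELS))
    return {dk: {s_new: {s_old: mass for s_old in LABELS}
                 for s_new in LABELS}
            for dk in range(1, D_MAX + 1)}

def total_mass(kappa, s_old):
    """Left-hand side of Eq. (1) for a fixed current label s_old."""
    return sum(kappa[dk][s_new][s_old]
               for dk in kappa for s_new in kappa[dk])
```

A behaviorally richer kernel would weight switching toward higher time sensitivity, susceptibility, or criticality and dampen it with elapsed inspection time, per the SCF discussion above.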

We denote D_max ∈ R_+ as the Maximum Allowable Delay (MAD). At time t ≥ t_k, the k-th alert’s Age of Information (AoI) [32] is defined as t^k_AoI := t − t_k. This work focuses on time-critical ICSs where a defensive response for the k-th alert is only effective when the alert’s AoI is within the MAD, i.e., t^k_AoI ≤ D_max. Therefore, the operator will be reminded when an alert’s AoI exceeds the MAD so that he can switch to monitor and inspect new alerts. The MAD and the reminder scheme help mitigate the SCF when the operators are occupied with old alerts and miss the chance to monitor and inspect new alerts in real time.

C. Attentional Factors

We identify the following human and environmental factors affecting operators’ alert inspection and response processes.
• The operator’s expertise level, denoted by y_EL ∈ Y_EL.
• The k-th alert’s category label s^k ∈ S.
• The k-th attack’s type θ^k and target φ^k.
• The operator’s stress level y^t_SL ∈ R_+, which changes with time t as new alerts arrive.

The first three factors are static attributes of the analyst, the alert, and the IDoS attack, respectively. They determine the average inspection time, denoted by $\bar{d}(y_{EL}, s^k, \theta^k, \phi^k) \in \mathbb{R}^{+}$, to reach a complete response $w_{FE}$ or $w_{RE}$. For example, if the inspected alert is of low complexity, the operator can reach a complete response in a shorter time. Also, it takes a senior operator less time on average to reach a complete alert response than a junior one. We use $d(y_{EL}, s^k, \theta^k, \phi^k)$ to represent the Actual Inspection Time Needed (AITN) when the operator is of expertise level $y_{EL}$, the alert is of category label $s^k$, and the attack has type $\theta^k$ and target $\phi^k$. The AITN $d(y_{EL}, s^k, \theta^k, \phi^k)$ is a random variable with mean $\bar{d}(y_{EL}, s^k, \theta^k, \phi^k)$.

The fourth factor reflects the temporal aspect of human attention during the inspection process. Evidence has shown that the continuous arrival of alerts can increase the stress level of human operators [33], and 52% of employees attribute their mistakes to stress [3]. We denote $n^t \in \mathbb{Z}^{0+}$ as the number of alerts that arrive during the current inspection up to time $t \in [0,\infty)$ and model the operator's stress level $y_{SL}^{t}$ as an increasing function $f_{SL}$ of $n^t$, i.e., $y_{SL}^{t} = f_{SL}(n^t)$. At time $t \in [0,\infty)$, the human operator's Level of Operational Efficiency (LOE), denoted by $\omega^t \in [0,1]$, is a function $f_{LOE}$ of the stress level $y_{SL}^{t}$, i.e.,

$$\omega^t = f_{LOE}(y_{SL}^{t}) = (f_{LOE} \circ f_{SL})(n^t), \quad \forall t \in [0,\infty). \qquad (2)$$

Based on the Yerkes–Dodson law, the function $f_{LOE}$ follows an inverted U-shape with the following two regions. In region one, a small number of alerts results in a moderate stress level and allows the human operator to inspect the alert efficiently. In region two, the LOE starts to decrease once the number of alerts to inspect exceeds some threshold $\bar{n}(y_{EL}, s^k)$ and the human operator is overloaded. The value of the attention threshold $\bar{n}(y_{EL}, s^k)$ depends on the operator's expertise level $y_{EL} \in \mathcal{Y}_{EL}$ and the alert's category label $s^k \in \mathcal{S}$. For example, it requires more (resp. fewer) alerts, i.e., a higher (resp. lower) attention threshold, to overload a senior (resp. an inexperienced) operator. We can also adapt the value of $\bar{n}(y_{EL}, s^k)$ to different scenarios. In the extreme case where all alerts are of high complexity and create a heavy cognitive load, we let $\bar{n}(y_{EL}, s^k) = 0, \forall y_{EL} \in \mathcal{Y}_{EL}, s^k \in \mathcal{S}$, and the LOE decreases monotonically with the number of alert arrivals during an inspection.
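For illustration, a minimal plateau-then-decay parameterization of the composition $(f_{LOE} \circ f_{SL})$ in (2): the functional form and constants below are assumptions, since the model only requires full efficiency in region one and decay beyond the attention threshold $\bar{n}$.

```python
import math

def loe(n_alerts: int, n_bar: int = 5, decay: float = 0.2) -> float:
    """Level of Operational Efficiency omega^t as a function of the number
    of alerts n^t arriving during the current inspection.

    Region one (n <= n_bar): moderate stress, full efficiency.
    Region two (n > n_bar): overloaded, efficiency decays exponentially.
    The plateau height, threshold, and decay rate are illustrative.
    """
    if n_alerts <= n_bar:
        return 1.0
    return math.exp(-decay * (n_alerts - n_bar))
```

Setting `n_bar = 0` recovers the extreme case above, in which the LOE decreases monotonically with every alert arrival.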

D. Alert Responses under Time and Capacity Limitations

After identifying the attentional factors in Section IV-C, we illustrate their impacts on the operators' alert responses as follows. We define the Effective Inspection Time (EIT) during the inspection window $[t_1, t_2]$ as the integral $\omega^{t_1,t_2} := \int_{t_1}^{t_2} \omega^t \, dt$. When the operator is overloaded and has a low LOE during $[t_1, t_2]$, the EIT $\omega^{t_1,t_2}$ is much shorter than the actual inspection time $t_2 - t_1$.

Suppose that the operator of expertise level $y_{EL}$ inspects the $k$-th alert for a duration $[t_1, t_2]$. If the EIT has exceeded the AITN $d(y_{EL}, s^k, \theta^k, \phi^k)$, then the operator can reach a complete response $w_{FE}$ or $w_{RE}$ with a high success probability, denoted by $p_{SP}(y_{EL}, s^k, \theta^k, \phi^k) \in [0,1]$. However, when $\omega^{t_1,t_2} < d(y_{EL}, s^k, \theta^k, \phi^k)$, the operator has not completed the inspection, and the alert response concerning the $k$-th alert is $w^k = w_{UN}$. The success probability $p_{SP}$ depends on the operator's capacity to identify attacks' types, which leads to the definition of the capacity gap below.

Definition 2 (Capacity Gap). For an operator of expertise level $y_{EL} \in \mathcal{Y}_{EL}$, we define $p_{CG}(y_{EL}, s^k, \theta^k, \phi^k) := 1 - p_{SP}(y_{EL}, s^k, \theta^k, \phi^k)$ as his capacity gap to inspect an alert with category label $s^k \in \mathcal{S}$, type $\theta^k \in \Theta$, and target $\phi^k \in \Phi$ defined in Section III.
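The EIT/AITN comparison that decides between a complete response and $w_{UN}$ can be sketched as follows; the sampled LOE trajectory and the time step are illustrative assumptions.

```python
def effective_inspection_time(loe_samples, dt: float) -> float:
    """Riemann-sum approximation of the EIT integral of the LOE omega^t
    over the inspection window, with one sample every dt seconds."""
    return sum(loe_samples) * dt

def reaches_complete_response(loe_samples, dt: float, aitn: float) -> bool:
    """A complete response (w_FE or w_RE) is only reachable when the EIT
    meets the AITN; otherwise the response stays w_UN."""
    return effective_inspection_time(loe_samples, dt) >= aitn
```

Even when this condition holds, the model grants a complete response only with success probability $p_{SP}$.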

V. HUMAN-ASSISTIVE SECURITY TECHNOLOGY

As illustrated in Section IV, the frequent arrival of alerts triggered by IDoS attacks can overload the human operator and reduce the LOE and the EIT. To compensate for the human's attentional limitation, we can intentionally make some alerts less noticeable, e.g., without sounds or in a light color, based on their category labels. As illustrated by the blue box in Fig. 1, based on the category labels from the triage process, RADAMS automatically emphasizes and de-emphasizes alerts, and then presents them to the tier 1 SOC analysts.

A. Adaptive AM Strategy

In this work, we focus on the class of Attention Management (AM) strategies, denoted by $\mathcal{A} := \{a_m\}_{m \in \{0,1,\cdots,M\}}$, that de-emphasize consecutive alerts. As explained in Section IV-A, the operator can only inspect some alerts in real time. Thus, we use $I_h \in \mathbb{Z}^{0+}$ and $t^{I_h} \in [0,\infty)$ to denote the index and the time of the alert under the $h$-th inspection; i.e., the inspection stage $h \in \mathbb{Z}^{0+}$ is equivalent to the attack stage $I_h \in \mathbb{Z}^{0+}$. Whenever the operator starts a new inspection at inspection stage $h \in \mathbb{Z}^{0+}$, RADAMS determines the AM action $a^h \in \mathcal{A}$ for the $h$-th inspection based on the stationary strategy $\sigma \in \Sigma : \mathcal{S} \mapsto \mathcal{A}$ that adapts to the category label of the $h$-th alert. We illustrate the timeline of the manual inspections and the AM strategies in green and blue, respectively, in Fig. 2. The solid and dashed green arrows indicate the inspected and uninspected alerts, respectively. The non-transparent and semi-transparent blue arrows indicate the emphasized and de-emphasized alerts, respectively. At inspection stage $h$, if $a^h = a_m$, RADAMS will make the next $m$ alerts less noticeable; i.e., the alerts at attack stages $I_h + 1, \cdots, I_h + m$ are de-emphasized. Denote $\kappa_{SW}^{I_{h+1}-I_h, a^h}(s^{I_{h+1}} \mid s^{I_h})$ as the operator's switching probability to these de-emphasized alerts under the AM action $a^h \in \mathcal{A}$. Analogously to (1), the following holds for all $h \in \mathbb{Z}^{0+}$ and $a^h \in \mathcal{A}$:

$$\sum_{I_{h+1}=I_h+1}^{\infty} \sum_{s^{I_{h+1}} \in \mathcal{S}} \kappa_{SW}^{I_{h+1}-I_h, a^h}(s^{I_{h+1}} \mid s^{I_h}) \equiv 1, \quad \forall s^{I_h} \in \mathcal{S}. \qquad (3)$$

The deliberate de-emphasis of selected alerts brings the following tradeoff. On the one hand, these alerts do not increase the operator's stress level, so the operator can pay sustained attention to the alert under inspection with high LOE and EIT. On the other hand, these alerts do not draw the operator's attention, and the operator is less likely to switch to them during the real-time monitoring and inspections.

Since the operator may switch to inspect a de-emphasized alert with switching probability $\kappa_{SW}^{I_{h+1}-I_h, a^h}$ (e.g., the $h$-th inspection in Fig. 2), RADAMS recomputes the AM strategy and implements the new strategy whenever the operator starts to inspect a new alert. Although the operator can switch unpredictably, Proposition 1 shows that the transition of the inspected alerts' category labels is Markov.

Proposition 1. For a stationary AM strategy $\sigma \in \Sigma$, the set of random variables $(S^{I_h}, T^{I_h})_{h \in \mathbb{Z}^{0+}}$ is a Markov renewal process.

Proof. The sketch of the proof includes two steps. First, we prove that the state transition from $s^{I_h}$ to $s^{I_{h+1}}$ is Markov for all $h \in \mathbb{Z}^{0+}$. Due to the uncertainty of switching during inspection, the transition stage $I_{h+1}$ is also a random variable for all $h \in \mathbb{Z}^{0+}$, and we can represent the transition probability as

$$\Pr(S^{I_{h+1}} = s^{I_{h+1}} \mid s^{I_h}) = \sum_{l=1}^{\infty} \Pr(I_{h+1} = I_h + l) \cdot \Pr(S^{I_h + l} = s^{I_{h+1}} \mid s^{I_h}),$$

where $\Pr(I_{h+1} = I_h + l)$ is the probability that the $(h+1)$-th inspection happens at attack stage $I_h + l$. The term $\Pr(S^{I_h + l} = s^{I_{h+1}} \mid s^{I_h})$ is Markov and can be computed based on $\kappa_{CL}$. The term $\Pr(I_{h+1} = I_h + l)$ depends on $d(y_{EL}, s^{I_h + l'}, \theta^{I_h + l'}, \phi^{I_h + l'})$, $\kappa_{SW}^{l'}$, and $\tau^{l'}$, for all $l' \in \{1, \cdots, l\}$. Since $s^{I_h + l'}, \theta^{I_h + l'}, \phi^{I_h + l'}$, $l' \in \{1, \cdots, l\}$, are all stochastically related to $s^{I_h}$ and $s^{I_{h+1}}$ based on $o$, $\kappa_{AT}$, and $\kappa_{CL}$, the term $\Pr(I_{h+1} = I_h + l)$ depends on $s^{I_h}$ and $s^{I_{h+1}}$ for all $l \in \mathbb{Z}^{+}$.

Then, we show that the distribution of the inter-arrival time $T_{IN}^{I_h,m} := T^{I_{h+1}} - T^{I_h}$ only depends on $s^{I_h}$ and $s^{I_{h+1}}$. Analogously, the cumulative distribution function of $T_{IN}^{I_h,m}$ is

$$\Pr(T_{IN}^{I_h,m} \leq t) = \sum_{l=1}^{\infty} \Pr(I_{h+1} = I_h + l) \cdot \Pr(T_{IN}^{I_h,m} \leq t \mid I_{h+1} = I_h + l),$$

and hence we arrive at the Markov property. ∎

B. Stage Cost and Expected Cumulative Cost

For each alert at attack stage $k \in \mathbb{Z}^{0+}$, RADAMS assigns a stage cost $c(w^k, s^k)$ based on the alert response $w^k \in \mathcal{W}$ and the category label $s^k \in \mathcal{S}$. The value of the cost can be estimated from the salary of SOC analysts and the estimated loss of the associated attack. For example, $c(w_{UN}, s^{I_h})$ and $c(w_{NI}, s^{I_h})$ are positive costs, as alerts without a complete response incur additional workloads. The delayed inspections also expose the organization to the threats of time-sensitive attacks. On the other hand, $c(w_{FE}, s^{I_h})$ and $c(w_{RE}, s^{I_h})$ are negative costs, as the alerts with complete alert responses $w_{FE}$ and $w_{RE}$ reduce the workload of tier 2 SOC analysts and enable them to obtain threat intelligence.

When the operator starts a new inspection at inspection stage $h+1$, RADAMS evaluates the effectiveness of the AM strategy for the $h$-th inspection. The performance evaluation is reflected by the Expected Consolidated Cost (ECoC) $\bar{c} : \mathcal{S} \times \mathcal{A} \mapsto \mathbb{R}$ at each inspection stage $h \in \mathbb{Z}^{0+}$. We denote the realization of $\bar{c}(s^{I_h}, a^h)$ as the Consolidated Cost (CoC) $c^{I_h}(s^{I_h}, a^h)$. Since the AM strategy $\sigma$ at each inspection stage can affect the future human inspection process and the alert responses, we define the Expected Cumulative Cost (ECuC) $u(s^{I_h}, \sigma) := \sum_{h=0}^{\infty} \gamma^h \bar{c}(s^{I_h}, \sigma(s^{I_h}))$ under the adaptive strategy $\sigma \in \Sigma$ as the long-term performance measure. The goal of the assistive technology is to design the optimal adaptive strategy $\sigma^* \in \Sigma$ that minimizes the ECuC $u$ under the presented IDoS attack based on the category label $s^{I_h} \in \mathcal{S}$ at each inspection stage $h$. We define $v^*(s^{I_h}) := \min_{\sigma \in \Sigma} u(s^{I_h}, \sigma)$ as the optimal ECuC when the category label is $s^{I_h} \in \mathcal{S}$. We refer to the default AM strategy $\sigma_0 \in \Sigma$ as the one where no AM action is applied under any category label, i.e., $\sigma_0(s^{I_h}) = a_0, \forall s^{I_h} \in \mathcal{S}$.
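The ECuC above is a standard discounted sum, which a short sketch makes concrete; the cost sequence and discount factor below are illustrative, and the true ECuC is an infinite-horizon expectation rather than a finite truncation.

```python
def ecuc(stage_costs, gamma: float = 0.9) -> float:
    """Truncated discounted cumulative cost sum_h gamma^h * c_h over a
    finite sequence of realized consolidated costs."""
    return sum((gamma ** h) * c for h, c in enumerate(stage_costs))
```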

C. Reinforcement Learning

Due to the absence of the following exact model parameters, RADAMS has to learn the optimal AM strategy $\sigma^* \in \Sigma$ based on the operator's alert responses in real time.
• Parameters of the IDoS attack model (e.g., $\kappa_{AT}$ and $z$) and the alert generation model (e.g., $o$) in Section III.
• Parameters of the human attention model (e.g., $f_{LOE}$ and $f_{SL}$), the inspection model (e.g., $\kappa_{SW}^{\Delta k}$, $\kappa_{SW}^{I_{h+1}-I_h, a^h}$, and $d$), and the alert response model (e.g., $y_{EL}$ and $p_{SP}$) in Section IV.

Define $Q^h(s^{I_h}, a^h)$ as the estimated ECuC during the $h$-th inspection when the category label is $s^{I_h} \in \mathcal{S}$ and the AM action is $a^h$. Based on Proposition 1, the state transition is Markov, which enables the following Q-learning update:

$$Q^{h+1}(s^{I_h}, a^h) := \big(1 - \alpha^h(s^{I_h}, a^h)\big) Q^h(s^{I_h}, a^h) + \alpha^h(s^{I_h}, a^h) \Big[c^{I_h}(s^{I_h}, a^h) + \gamma \min_{a' \in \mathcal{A}} Q^h(s^{I_{h+1}}, a')\Big], \qquad (4)$$

where $s^{I_h}$ and $s^{I_{h+1}}$ are the observed category labels of the alerts at attack stages $I_h$ and $I_{h+1}$, respectively. When the learning rate $\alpha^h(s^{I_h}, a^h) \in (0,1)$ satisfies $\sum_{h=0}^{\infty} \alpha^h(s^{I_h}, a^h) = \infty$ and $\sum_{h=0}^{\infty} (\alpha^h(s^{I_h}, a^h))^2 < \infty, \forall s^{I_h} \in \mathcal{S}, \forall a^h \in \mathcal{A}$, and all state-action pairs are explored infinitely often, $\min_{a' \in \mathcal{A}} Q^h(s^{I_h}, a')$ converges to the optimal ECuC $v^*(s^{I_h})$ with probability 1 as $h \to \infty$. At each inspection stage $h \in \mathbb{Z}^{0+}$, RADAMS selects the AM action $a^h \in \mathcal{A}$ based on the $\varepsilon$-greedy policy; i.e., RADAMS chooses a random action with a small probability $\varepsilon \in [0,1]$ and the optimal action $\arg\min_{a' \in \mathcal{A}} Q^h(s^{I_h}, a')$ with probability $1 - \varepsilon$.
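A minimal sketch of the cost-minimizing Q-learning update (4) with an $\varepsilon$-greedy policy follows; the state/action encodings and hyperparameter values are illustrative assumptions, not the paper's implementation.

```python
import random
from collections import defaultdict

class QLearner:
    """Tabular Q-learning that minimizes cost, mirroring update (4)."""

    def __init__(self, actions, gamma=0.9, alpha=0.1, epsilon=0.1):
        self.q = defaultdict(float)  # Q[(state, action)], initialized to 0
        self.actions = list(actions)
        self.gamma, self.alpha, self.epsilon = gamma, alpha, epsilon

    def best_action(self, state):
        # The greedy action attains min_{a'} Q(s, a') since costs are minimized.
        return min(self.actions, key=lambda a: self.q[(state, a)])

    def select_action(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)  # explore
        return self.best_action(state)          # exploit

    def update(self, state, action, cost, next_state):
        # Q <- (1 - alpha) Q + alpha [c + gamma * min_{a'} Q(s', a')]
        target = cost + self.gamma * min(
            self.q[(next_state, a)] for a in self.actions)
        self.q[(state, action)] = ((1 - self.alpha) * self.q[(state, action)]
                                   + self.alpha * target)
```

In RADAMS, `state` would be the category label $s^{I_h}$, `action` the AM action $a_m$, and `cost` the realized CoC $c^{I_h}(s^{I_h}, a^h)$.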

We present the algorithm to learn the adaptive AM strategy based on the operator's real-time alert monitoring and inspection process in Algorithm 1. Each simulation run corresponds to the operator's work shift of 24 hours at the SOC. Since the SOC can receive over 10 thousand alerts in each work shift, we can use an infinite horizon to approximate the total number of attack stages $K > 10{,}000$. Whenever the operator starts to inspect a new alert at inspection stage $h+1$, RADAMS applies the Q-learning update in (4) based on the category label $s^{I_{h+1}}$ of the newly arrived alert and determines the AM action $a^{h+1}$ for the $(h+1)$-th inspection based on the $\varepsilon$-greedy policy, as shown in lines 12 and 19 of Algorithm 1. The CoC $c^{I_h}(s^{I_h}, a^h)$ of the $h$-th inspection under the AM action $a^h \in \mathcal{A}$ and the category label

Algorithm 1: Algorithm to Learn the Adaptive AM Strategy based on the Operator's Real-Time Alert Inspection

1  Input $K$: the total number of attack stages;
2  Initialize: the operator starts the $h$-th inspection under AM action $a^h \in \mathcal{A}$; $I_h = k_0$; $c^{I_h}(s^{I_h}, a^h) = 0$;
3  for $k \leftarrow k_0 + 1$ to $K$ do
4      if the operator has finished the $I_h$-th alert (i.e., EIT > AITN) then
5          if capable (i.e., rand $\leq p_{SP}(y_{EL}, s^k, \theta^k, \phi^k)$) then
6              dismiss (i.e., $w^{I_h} = w_{FE}$) or escalate (i.e., $w^{I_h} = w_{RE}$) the $I_h$-th alert;
7          else
8              queue up the $I_h$-th alert, i.e., $w^{I_h} = w_{UN}$;
9          end
10         $c^{I_h}(s^{I_h}, a^h) = c^{I_h}(s^{I_h}, a^h) + c(w^{I_h}, s^{I_h})$;
11         $I_{h+1} \leftarrow k$; the operator starts to inspect the $k$-th alert with category label $s^{I_{h+1}}$;
12         update $Q^{h+1}(s^{I_h}, a^h)$ via (4) and obtain the AM action $a^{h+1}$ by the $\varepsilon$-greedy policy;
13         $c^{I_{h+1}}(s^{I_{h+1}}, a^{h+1}) = 0$; $h \leftarrow h + 1$;
14     else
15         if the operator chooses to switch, or the MAD is reached, i.e., $t^k - t^{I_h} \geq D_{max}$, then
16             queue up the $I_h$-th alert (i.e., $w^{I_h} = w_{UN}$);
17             $c^{I_h}(s^{I_h}, a^h) = c^{I_h}(s^{I_h}, a^h) + c(w_{UN}, s^{I_h})$;
18             $I_{h+1} \leftarrow k$; the operator starts to inspect the $k$-th alert with category label $s^{I_{h+1}}$;
19             update $Q^{h+1}(s^{I_h}, a^h)$ via (4) and obtain the AM action $a^{h+1}$ by the $\varepsilon$-greedy policy;
20             $c^{I_{h+1}}(s^{I_{h+1}}, a^{h+1}) = 0$; $h \leftarrow h + 1$;
21         else
22             the operator continues the inspection of the $I_h$-th alert with decreased LOE;
23             the $k$-th alert is queued up for delayed inspection (i.e., $w^k = w_{NI}$);
24             $c^{I_h}(s^{I_h}, a^h) = c^{I_h}(s^{I_h}, a^h) + c(w_{NI}, s^k)$;
25         end
26     end
27 end
28 Return $Q^h(s, a), \forall s \in \mathcal{S}, a \in \mathcal{A}$;

$s^{I_h}$ of the inspected alert can be computed iteratively from the stage costs $c(w^k, s^k)$ of the alerts during the attack stages $k \in \{I_h, \cdots, I_{h+1} - 1\}$, as shown in lines 13, 20, and 24 of Algorithm 1.

VI. THEORETICAL ANALYSIS

In Section VI, we focus on the class of ambitious operators who attempt to inspect all alerts, i.e., $\kappa_{SW}(s^{k+\Delta k} \mid s^k) = \mathbf{1}_{\{\Delta k = 1\}}, \forall s^k, s^{k+\Delta k} \in \mathcal{S}, \forall \Delta k \in \mathbb{Z}^{+}$. To assist this class of operators, the implemented AM action $a_m, m \in \{0, 1, \cdots, M\}$, makes the selected alerts fully unnoticeable. Then, under $a_m \in \mathcal{A}$, the operator at inspection stage $h$ can pay sustained attention to inspect the alert of category label $s^{I_h} \in \mathcal{S}$ for $m+1$ attack stages. Moreover, the operator switches to the new alert at attack stage $I_{h+1}$, i.e., $\sum_{s^{I_h+m+1} \in \mathcal{S}} \kappa_{SW}^{I_{h+1}-I_h, a_m}(s^{I_h+m+1} \mid s^{I_h}) = \mathbf{1}_{\{I_{h+1}-I_h = m+1\}}$. Throughout the section, we omit the expertise level $y_{EL}$ in the functions $d$, $\bar{d}$, $p_{SP}$, and $p_{CG}$, as $y_{EL}$ is constant across all attack stages.

A. Security Metrics

We propose two security metrics in Definition 3 to evaluate the performance of ambitious operators under IDoS attacks and different AM strategies. The first metric, denoted by $p_{UN}(s^{I_h}, a^h)$, is the probability that the operator chooses $w_{UN}$ during the $h$-th inspection under the category label $s^{I_h} \in \mathcal{S}$ and AM action $a^h \in \mathcal{A}$. This metric reflects the Attentional Deficiency Level (ADL) of the IDoS attack. For example, as the attackers generate more feints at a higher frequency, the operator is persistently distracted by the new alerts, and it becomes unlikely for him to fully respond to an alert. The ADL $p_{UN}(s^{I_h}, a^h)$ is high in this scenario. We use the ECuC $u(s^{I_h}, \sigma)$ as the second metric, which evaluates the IDoS risk under the category label $s^{I_h} \in \mathcal{S}$ and the AM strategy $\sigma \in \Sigma$. For both metrics, smaller values are preferred.

Definition 3 (Attentional Deficiency Level and Risk). Under category label $s^{I_h} \in \mathcal{S}$ and the stationary AM strategy $\sigma \in \Sigma$, we define $p_{UN}(s^{I_h}, \sigma(s^{I_h}))$ and $u(s^{I_h}, \sigma)$ as the Attentional Deficiency Level (ADL) and the risk of the IDoS attacks defined in Section III, respectively.

B. Closed-Form Computations

The Markov renewal process that characterizes the IDoS attack and the associated alert sequence follows a Poisson process when Condition 1 holds.

Condition 1 (Poisson Arrival). The inter-arrival time $\tau^k$ for every attack stage $k \in \mathbb{Z}^{0+}$ is exponentially distributed with the same arrival rate $\beta > 0$, i.e., $z(\tau \mid \theta^{k+1}, \phi^{k+1}, \theta^k, \phi^k) = \beta e^{-\beta\tau}, \tau \in [0, \infty)$, for all $\theta^{k+1}, \theta^k \in \Theta$ and $\phi^{k+1}, \phi^k \in \Phi$.

Recall that the random variable $T_{IN}^{I_h,m}$ represents the inspection time of the $I_h$-th alert under the AM action $a^h = a_m \in \mathcal{A}$. For the ambitious operators under AM action $a_m \in \mathcal{A}$ at inspection stage $h$, the next inspection happens at attack stage $I_{h+1} = I_h + m + 1$. Thus, $I_{h+1}$ is no longer a random variable. As a summation of $m+1$ i.i.d. exponentially distributed random variables of rate $\beta$, $T_{IN}^{I_h,m}$ follows an Erlang distribution, denoted by $\bar{z}$, with shape $m+1$ and rate $\beta > 0$ when Condition 1 holds, i.e., $\bar{z}(\tau) = \frac{\beta^{m+1} \tau^m e^{-\beta\tau}}{m!}, \tau \in [0, \infty)$.

Denote $p_{SD}^{h}(w^{I_h} \mid s^{I_h}, a^h; \theta^{I_h}, \phi^{I_h})$ as the probability that the operator makes alert response $w^{I_h}$ at inspection stage $h$. To obtain a theoretical underpinning, we consider the case where the AITN equals the average inspection time, i.e., $d(s^k, \theta^k, \phi^k) = \bar{d}(s^k, \theta^k, \phi^k)$. Then, the operator under AM action $a_m$ makes a complete alert response (i.e., $w^{I_h} \in \{w_{FE}, w_{RE}\}$) at inspection stage $h$ for category label $s^{I_h}$ if the inspection time $\tau_{IN}^{I_h,m}$ is greater than the AITN. The probability of this event can be represented as $\int_{\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})}^{\infty} p_{SP}(s^{I_h}, \theta^{I_h}, \phi^{I_h}) \bar{z}(\tau) \, d\tau = p_{SP}(s^{I_h}, \theta^{I_h}, \phi^{I_h}) \cdot \sum_{n=0}^{m} \frac{1}{n!} e^{-\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})} \big(\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})\big)^n$, which leads to

$$p_{SD}^{h}(w_{UN} \mid s^{I_h}, a_m; \theta^{I_h}, \phi^{I_h}) = 1 - p_{SP}(s^{I_h}, \theta^{I_h}, \phi^{I_h}) \cdot \sum_{n=0}^{m} \frac{1}{n!} e^{-\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})} \big(\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})\big)^n. \qquad (5)$$
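Under Condition 1, the inspection time of an ambitious operator under $a_m$ is simply the sum of $m+1$ exponential inter-arrival times, so it can be simulated directly; the rate and sample size in the usage below are illustrative.

```python
import random

def sample_inspection_time(m: int, beta: float, rng: random.Random) -> float:
    """Sample T_IN under AM action a_m and Poisson arrivals of rate beta:
    the sum of m+1 i.i.d. Exp(beta) inter-arrival times, which follows an
    Erlang distribution with shape m+1 and rate beta (mean (m+1)/beta)."""
    return sum(rng.expovariate(beta) for _ in range(m + 1))
```

For example, with $m = 2$ and $\beta = 0.5$ the empirical mean of many samples approaches $(m+1)/\beta = 6$ seconds.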

Then, the ADL $p_{UN}(s^{I_h}, a^h)$ can be computed as

$$\sum_{\theta^{I_h} \in \Theta, \phi^{I_h} \in \Phi} \Pr(\theta^{I_h}, \phi^{I_h} \mid s^{I_h}) \cdot p_{SD}^{h}(w_{UN} \mid s^{I_h}, a^h; \theta^{I_h}, \phi^{I_h}), \qquad (6)$$

where the conditional probability $\Pr(\theta^{I_h}, \phi^{I_h} \mid s^{I_h})$ can be computed via the Bayesian rule, i.e.,

$$\Pr(\theta^{I_h}, \phi^{I_h} \mid s^{I_h}) = \frac{o(s^{I_h} \mid \theta^{I_h}, \phi^{I_h}) \, b(\theta^{I_h}, \phi^{I_h})}{\sum_{\theta^{I_h} \in \Theta, \phi^{I_h} \in \Phi} o(s^{I_h} \mid \theta^{I_h}, \phi^{I_h}) \, b(\theta^{I_h}, \phi^{I_h})}.$$
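The Bayesian update above can be sketched as follows; the dictionary encodings of the revelation kernel $o$ and the stationary distribution $b$ are illustrative assumptions.

```python
def posterior(o, b, s):
    """Pr(theta, phi | s) over (type, target) pairs via the Bayesian rule,
    given o[(s, (theta, phi))] = o(s | theta, phi) and b[(theta, phi)]."""
    joint = {tp: o[(s, tp)] * b[tp] for tp in b}
    total = sum(joint.values())
    return {tp: v / total for tp, v in joint.items()}
```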

We can compute the ECoC $\bar{c}(s^{I_h}, a_m)$ explicitly as

$$\bar{c}(s^{I_h}, a_m) = m\,c(w_{NI}, s^{I_h}) + \sum_{\theta^{I_h} \in \Theta, \phi^{I_h} \in \Phi} \Pr(\theta^{I_h}, \phi^{I_h} \mid s^{I_h}) \cdot \sum_{w^{I_h} \in \mathcal{W}} p_{SD}^{h}(w^{I_h} \mid s^{I_h}, a_m; \theta^{I_h}, \phi^{I_h}) \, c(w^{I_h}, s^{I_h}). \qquad (7)$$

For the prudent operators in Section IV-A, we have

$$p_{SD}^{h}(w_i \mid s^{I_h}, a^h; \theta_i, \phi^{I_h}) = 1 - p_{SD}^{h}(w_{UN} \mid s^{I_h}, a^h; \theta_i, \phi^{I_h}), \qquad (8)$$

for all $i \in \{FE, RE\}, s^{I_h} \in \mathcal{S}, a^h \in \mathcal{A}, \phi^{I_h} \in \Phi, h \in \mathbb{Z}^{0+}$. Plugging (8) into (7), we can simplify the ECoC $\bar{c}(s^{I_h}, a_m)$ as

$$\bar{c}(s^{I_h}, a_m) = \sum_{\phi^{I_h} \in \Phi} \sum_{i \in \{FE, RE\}} \Pr(\theta_i, \phi^{I_h} \mid s^{I_h}) \cdot p_{SD}^{h}(w_i \mid s^{I_h}, a_m; \theta_i, \phi^{I_h}) \cdot \big[c(w_i, s^{I_h}) - c(w_{UN}, s^{I_h})\big] + m\,c(w_{NI}, s^{I_h}) + c(w_{UN}, s^{I_h}). \qquad (9)$$

As shown in Proposition 2, the ADL and the risk are monotone functions of $\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})$ for each AM strategy.

Proposition 2. If Condition 1 holds, then the ADL $p_{UN}(s^{I_h}, \sigma(s^{I_h}))$ and the risk $u(s^{I_h}, \sigma)$ of an IDoS attack under category label $s^{I_h} \in \mathcal{S}$ and AM strategy $\sigma \in \Sigma$ increase in the value of the product $\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})$.

Proof. First, since $p_{SD}^{h}(w_{UN})$ in (5) increases monotonically with respect to the product $\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})$, the values of $p_{SD}^{h}(w_{FE})$ and $p_{SD}^{h}(w_{RE})$ in (8) decrease monotonically with respect to the product. Plugging (5) into (6), we obtain that $p_{UN}(s^{I_h}, a_m)$ in (10), under any $a_m \in \mathcal{A}$ and $s^{I_h} \in \mathcal{S}$, is a summation of functions increasing in $\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})$:

$$p_{UN}(s^{I_h}, a_m) = \sum_{\phi^{I_h} \in \Phi} \sum_{i \in \{FE, RE\}} \Pr(\theta_i, \phi^{I_h} \mid s^{I_h}) \Big[1 - p_{SP}(s^{I_h}, \theta_i, \phi^{I_h}) \cdot \sum_{n=0}^{m} \frac{1}{n!} e^{-\beta\bar{d}(s^{I_h}, \theta_i, \phi^{I_h})} \big(\beta\bar{d}(s^{I_h}, \theta_i, \phi^{I_h})\big)^n\Big]. \qquad (10)$$

Second, since $c(w_{FE}, s^{I_h})$ and $c(w_{RE}, s^{I_h})$ are negative and $c(w_{UN}, s^{I_h})$ is positive, the ECoC in (9) increases with $\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})$ under any $a_m \in \mathcal{A}$ and $s^{I_h} \in \mathcal{S}$. Then, the risk also increases with the product due to the monotonicity of the Bellman operator [34]. ∎

Remark 1 (Product Principle of Attention (PPoA)). On the one hand, as $\beta$ increases, the feint and real attacks arrive at a higher frequency on average, resulting in a higher demand for attention resources from the human operator. On the other hand, as $\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})$ increases, the human operator requires a longer inspection time to determine the attack's type, leading to a lower supply of attention resources. Proposition 2 characterizes the PPoA: the ADL and the risk of IDoS attacks depend on the product of the supply and demand of attention resources under any stationary AM strategy $\sigma \in \Sigma$.
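The PPoA can be checked numerically: the per-type incomplete-response probability in (5) depends on $\beta$ and $\bar{d}$ only through their product. A sketch with illustrative parameter values:

```python
import math

def p_un_given_type(beta_d: float, m: int, p_sp: float) -> float:
    """Incomplete-response probability in (5) as a function of the product
    beta * d_bar: 1 - p_SP * sum_{n=0}^{m} e^{-bd} (bd)^n / n!.
    As m grows, the sum tends to 1 and the value tends to the capacity gap
    1 - p_SP, matching Lemma 1 below."""
    poisson_cdf = sum(math.exp(-beta_d) * beta_d ** n / math.factorial(n)
                      for n in range(m + 1))
    return 1.0 - p_sp * poisson_cdf
```

Doubling the arrival rate $\beta$ while halving the average inspection time $\bar{d}$ leaves this probability unchanged, which is exactly the PPoA.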

C. Fundamental Limits under AM strategies

Section VI-C shows the fundamental limits of the IDoS attack's ADL, the ECoC, and the risk under different AM strategies. Define the shorthand notation $p(s^{I_h}) := \sum_{\phi^{I_h} \in \Phi} \sum_{i \in \{FE, RE\}} \Pr(\theta_i, \phi^{I_h} \mid s^{I_h}) \, p_{CG}(s^{I_h}, \theta_i, \phi^{I_h})$.

Lemma 1. If Condition 1 holds and $M \to \infty$, then for each $s^{I_h} \in \mathcal{S}$, the ADL $p_{UN}(s^{I_h}, a_m)$ decreases strictly to $p(s^{I_h})$ as $m$ increases.

Proof. Since $\frac{1}{n!} e^{-\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})} \big(\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})\big)^n > 0$ for every $n \in \mathbb{Z}^{0+}$, the value of $p_{UN}(s^{I_h}, a_m)$ in (10) strictly decreases as $m$ increases. Moreover, since $\lim_{m \to \infty} \sum_{n=0}^{m} \frac{1}{n!} e^{-\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})} \big(\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})\big)^n = 1$, we have $\min_{m \in \{0, \cdots, M\}} p_{UN}(s^{I_h}, a_m) \to p(s^{I_h})$ as $M \to \infty$ for all $s^{I_h} \in \mathcal{S}$. ∎

Remark 2 (Fundamental Limit of ADL). Lemma 1 shows that the minimum ADL under all AM actions $a_m \in \mathcal{A}$ is $p(s^{I_h})$. The value of $p(s^{I_h})$ depends on both the operator's capacity gaps $p_{CG}(s^{I_h}, \theta_i, \phi^{I_h}), i \in \{FE, RE\}$, and the frequency of feint and real attacks with different targets, i.e., $\Pr(\theta^{I_h}, \phi^{I_h} \mid s^{I_h}), \forall \theta^{I_h} \in \Theta, \phi^{I_h} \in \Phi$.

Denote the expected reward of making a complete alert response (i.e., the rewards for dismissing feints and escalating real attacks) as

$$\lambda(s^{I_h}, m, \phi^{I_h}) = \sum_{i \in \{FE, RE\}} c(w_i, s^{I_h}) \cdot \Pr(\theta_i, \phi^{I_h} \mid s^{I_h}) \cdot p_{SP}(s^{I_h}, \theta_i, \phi^{I_h}) \cdot \Big[\sum_{n=0}^{m} \frac{1}{n!} e^{-\beta\bar{d}(s^{I_h}, \theta_i, \phi^{I_h})} \big(\beta\bar{d}(s^{I_h}, \theta_i, \phi^{I_h})\big)^n\Big].$$

Combining (10) and (9), we can rewrite the ECoC as the combination of the following three terms:

$$\bar{c}(s^{I_h}, a_m) = p_{UN}(s^{I_h}, a_m) \, c(w_{UN}, s^{I_h}) + m\,c(w_{NI}, s^{I_h}) + \sum_{\phi^{I_h} \in \Phi} \lambda(s^{I_h}, m, \phi^{I_h}). \qquad (11)$$

Based on Lemma 1, the first term $p_{UN}(s^{I_h}, a_m) \, c(w_{UN}, s^{I_h})$ and the third term $\sum_{\phi^{I_h} \in \Phi} \lambda(s^{I_h}, m, \phi^{I_h})$ decrease in $m$, while the second term $m\,c(w_{NI}, s^{I_h})$ in (11) increases in $m$ linearly at the rate of $c(w_{NI}, s^{I_h})$. The tradeoff among the three terms is summarized below.

Remark 3 (Tradeoff among ADL, Reward of Alert Attention, and Impact of Alert Inattention). Based on Lemma 1 and (11), increasing $m$ reduces the ADL and achieves a higher reward for completing the alert response. However, increasing $m$ also linearly increases the impact of alert inattention, represented by $m\,c(w_{NI}, s^{I_h})$, the cost of uninspected alerts. Thus, we need to strike a balance among these terms to reduce the IDoS risk.

Define $\lambda_{\min}(s^{I_h}, \phi^{I_h}) := \sum_{i \in \{FE, RE\}} c(w_i, s^{I_h}) \Pr(\theta_i, \phi^{I_h} \mid s^{I_h}) \, p_{SP}(s^{I_h}, \theta_i, \phi^{I_h})$, $\lambda_{\max}^{\varepsilon_0}(s^{I_h}, \phi^{I_h}) := (1 - \varepsilon_0) \lambda_{\min}(s^{I_h}, \phi^{I_h})$, $\bar{c}_{\min}(s^{I_h}) := \sum_{\phi^{I_h} \in \Phi} \lambda_{\min}(s^{I_h}, \phi^{I_h}) + p(s^{I_h}) \, c(w_{UN}, s^{I_h}) + m\,c(w_{NI}, s^{I_h})$, and $\bar{c}_{\max}^{\varepsilon_0}(s^{I_h}) := \sum_{\phi^{I_h} \in \Phi} \lambda_{\max}^{\varepsilon_0}(s^{I_h}, \phi^{I_h}) + \big[p(s^{I_h}) + \varepsilon_0(1 - p(s^{I_h}))\big] c(w_{UN}, s^{I_h}) + m\,c(w_{NI}, s^{I_h})$.

Proposition 3. Consider the scenario where Condition 1 holds and $M > \bar{m}(s^{I_h})$. For any $\varepsilon_0 \in (0, 1]$ and $s^{I_h} \in \mathcal{S}$, there exists $\bar{m}(s^{I_h}) \in \mathbb{Z}^{+}$ such that $\bar{c}(s^{I_h}, a_m) \in [\bar{c}_{\min}(s^{I_h}), \bar{c}_{\max}^{\varepsilon_0}(s^{I_h})], \forall a_m \in \mathcal{A}$, when $m \geq \bar{m}(s^{I_h})$. Moreover, the lower bound $\bar{c}_{\min}(s^{I_h})$ and the upper bound $\bar{c}_{\max}^{\varepsilon_0}(s^{I_h})$ increase in $m$ linearly at the same rate $c(w_{NI}, s^{I_h})$.

Proof. For any $\varepsilon_0 \in (0, 1]$, there exists $\bar{m}(s^{I_h}) \in \mathbb{Z}^{+}$ such that $\sum_{n=0}^{m} \frac{1}{n!} e^{-\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})} \big(\beta\bar{d}(s^{I_h}, \theta^{I_h}, \phi^{I_h})\big)^n \in [1 - \varepsilon_0, 1]$ when $m \geq \bar{m}(s^{I_h})$. Based on Lemma 1, if $m > \bar{m}(s^{I_h})$, then $p_{UN}(s^{I_h}, a_m) \in [p(s^{I_h}), p(s^{I_h}) + \varepsilon_0(1 - p(s^{I_h}))]$. Plugging this into (11), we obtain the results. ∎

Let $\sigma_m \in \Sigma$ denote the AM strategy that de-emphasizes the next $m \geq \bar{m}(s^{I_h})$ alerts for all category labels $s^{I_h} \in \mathcal{S}$. The monotonicity of the Bellman operator [34] leads to the following corollary.

Corollary 1. Consider the scenario where Condition 1 holds and $M > \bar{m}(s^{I_h})$. For any $\varepsilon_0 \in (0, 1]$ and $s^{I_h} \in \mathcal{S}$, the upper and lower bounds of the risk $u(s^{I_h}, \sigma_m)$ increase in $m$ linearly at the same rate of $c(w_{NI}, s^{I_h})$.

Remark 4 (Fundamental Limit of ECoC and Risk). Proposition 3 and Corollary 1 show that, to reduce the ECoC and the risk of IDoS attacks, the number of de-emphasized alerts for any $s^{I_h} \in \mathcal{S}$ should not exceed $\bar{m}(s^{I_h})$.

VII. CASE STUDY

This section presents case studies to demonstrate the impact of IDoS attacks on human operators' alert inspections and alert responses, and to further illustrate the effectiveness of RADAMS. Throughout the section, we adopt the attention model in Section IV.

A. Experiment Setup

We consider an IDoS attack targeting either the Programmable Logic Controllers (PLCs) in the physical layer or the data centers in the cyber layer of an ICS. We denote these two targets as $\phi_P$ and $\phi_C$, respectively. They constitute the binary set of attack targets $\Phi = \{\phi_P, \phi_C\}$ defined in Section III-A. The SOC of the ICS is in charge of monitoring, inspecting, and responding to both the cyber and the physical alerts. We consider two system-level metrics defined in Section III-B, the source $\mathcal{S}_{SO} = \{s_{SO,P}, s_{SO,C}\}$ and the criticality $\mathcal{S}_{CR} = \{s_{CR,L}, s_{CR,H}\}$, i.e., $\mathcal{S} = \mathcal{S}_{SO} \times \mathcal{S}_{CR}$. Let $s_{SO,P}$ and $s_{SO,C}$ represent the sources of the physical and cyber layers, respectively. We assume that the alert triage process can accurately identify the source of attacks, i.e., $\Pr(s_{SO,i} \mid \phi_j) = \mathbf{1}_{\{i=j\}}, \forall i, j \in \{P, C\}$. Let $s_{CR,L}$ and $s_{CR,H}$ represent low and high criticality, respectively. We assume that the triage process cannot accurately identify feints as low criticality and real attacks as high criticality. The revelation kernel is separable and takes the form of $o(s_{SO}, s_{CR} \mid \theta_i, \phi_j) = \Pr(s_{SO} \mid \phi_j) \cdot \Pr(s_{CR} \mid \theta_i)$, $s_{SO} \in \mathcal{S}_{SO}, s_{CR} \in \mathcal{S}_{CR}, i \in \{FE, RE\}, j \in \{P, C\}$. We choose the values of $o$ so that the attack is more likely to be a feint (resp. real) when the criticality level is low (resp. high).

The inter-arrival time at attack stage $k \in \mathbb{Z}^{0+}$ follows an exponential distribution with rate $\beta(\theta^k, \theta^{k+1})$ parameterized by the attack's types $\theta^k, \theta^{k+1}$. Thus, the average inter-arrival time $\mu(\theta^k, \theta^{k+1}) := 1/\beta(\theta^k, \theta^{k+1})$ also depends on the attack's types at the current and the next attack stages, as shown in Table I. We choose the benchmark values based on the literature (e.g., [18], [35] and the references therein), and attackers can change these values in different IDoS attacks.

TABLE I: Benchmark values of the average inter-arrival time $\mu(\theta^k, \theta^{k+1}) = 1/\beta(\theta^k, \theta^{k+1}), \forall \theta^k, \theta^{k+1} \in \Theta$.

Average inter-arrival time from feints to real attacks: 6 s
Average inter-arrival time from real attacks to feints: 10 s
Average inter-arrival time between feints: 15 s
Average inter-arrival time between real attacks: 8 s

The average inspection time $\bar{d}$ in Section IV-C depends on the criticality $s^k_{CR}$ and the attack's type $\theta^k$ at attack stage $k \in \mathbb{Z}^{0+}$, as shown in Table II. We choose the benchmark values of $\bar{d}(s^k_{CR}, \theta^k)$ based on [18]; these values can change for different human operators and IDoS attacks. We add random noise uniformly distributed in $[-5, 5]$ to the average inspection time to simulate the AITN.

TABLE II: Benchmark values of the average inspection time $\bar{d}(s^k_{CR}, \theta^k), \forall \theta^k \in \Theta, s^k_{CR} \in \mathcal{S}_{CR}$.

Average time to inspect feints of low criticality: 6 s
Average time to inspect feints of high criticality: 8 s
Average time to inspect real attacks of low criticality: 15 s
Average time to inspect real attacks of high criticality: 20 s
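The AITN sampling described above, i.e., a benchmark mean from Table II plus uniform noise on $[-5, 5]$, can be sketched as follows; clipping at zero is an added assumption to keep the sampled time non-negative.

```python
import random

def sample_aitn(d_bar: float, rng: random.Random,
                half_width: float = 5.0) -> float:
    """AITN = average inspection time d_bar plus Uniform[-5, 5] noise,
    clipped at zero (the clipping never triggers for the Table II values)."""
    return max(0.0, d_bar + rng.uniform(-half_width, half_width))
```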

The stage cost $c(w^k, s^k_{SO})$ at attack stage $k \in \mathbb{Z}^{0+}$ in Section V-B depends on the alert response $w^k \in \mathcal{W}$ and the source $s^k_{SO} \in \mathcal{S}_{SO}$. We determine the benchmark values of $c(w^k, s^k_{SO})$ per alert in Table III based on the salary of the SOC analysts and the estimated loss of the associated attacks.

TABLE III: The benchmark values of the stage cost $c(w^k, s^k_{SO}), \forall w^k \in \mathcal{W}, s^k_{SO} \in \mathcal{S}_{SO}$.

Reward of dismissing feints ($w_{FE}$): \$80
Reward of identifying real attacks ($w_{RE}$) in the physical layer: \$500
Reward of identifying real attacks ($w_{RE}$) in the cyber layer: \$100
Cost of an incomplete alert response ($w_{UN}$ or $w_{NI}$): \$300

B. Analysis of Numerical Results

We plot the dynamics of the operator's alert responses in Fig. 3 under the benchmark experiment setup of Section VII-A. We use green, purple, orange, and yellow to represent $w_{UN}$, $w_{NI}$, $w_{FE}$, and $w_{RE}$, respectively. The heights of the squares also distinguish the four categories.

Fig. 3: Alert response $w^k \in \mathcal{W}$ for the $k$-th attack, whose type is shown on the y-axis. The $k$-th vertical dashed line represents the $k$-th alert's arrival time $t^k$.

1) Adaptive Learning during the Real-Time Monitoring and Inspection: Based on Algorithm 1, we illustrate the learning process of the estimated ECuC $Q^h(s^{I_h}, a^h)$ for all $s^{I_h} \in \mathcal{S}$ and $a^h \in \mathcal{A}$ at each inspection stage $h \in \mathbb{Z}^{0+}$ in Fig. 4. We choose the learning rate $\alpha^h(s^{I_h}, a^h) = \frac{k_c}{k_{TI}(s^{I_h}) - 1 + k_c}$, where $k_c \in (0, \infty)$ is a constant parameter and $k_{TI}(s^{I_h}) \in \mathbb{Z}^{0+}$ is the number of visits to $s^{I_h} \in \mathcal{S}$ up to stage $h \in \mathbb{Z}^{0+}$. Here, the AM action $a^h$ is implemented randomly at each inspection stage $h$, i.e., $\varepsilon = 1$. Thus, all four AM actions ($M = 3$) are explored equally on average for each $s^{I_h} \in \mathcal{S}$, as shown in Fig. 4. Since the number of visits to different category labels depends on the transition probability $\kappa_{AT}$, the learning stages for the four category labels are of different lengths.
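The visit-count-based learning rate above can be sketched as follows (the value of $k_c$ is illustrative); it satisfies the Robbins–Monro conditions required for the convergence of (4).

```python
def learning_rate(visit_count: int, k_c: float = 10.0) -> float:
    """alpha^h = k_c / (k_TI(s) - 1 + k_c), where visit_count = k_TI >= 1
    is the number of visits to the category label s so far."""
    return k_c / (visit_count - 1 + k_c)
```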

We denote the category labels $(s_{SO,P}, s_{CR,L})$, $(s_{SO,P}, s_{CR,H})$, $(s_{SO,C}, s_{CR,L})$, and $(s_{SO,C}, s_{CR,H})$ in blue, red, green, and black, respectively. To distinguish the four AM actions, a deeper color represents a larger $m \in \{0, 1, 2, 3\}$ for each category label $(s_{SO,i}, s_{CR,j}), i \in \{P, C\}, j \in \{H, L\}$. The inset black box magnifies the selected area. The optimal strategy $\sigma^* \in \Sigma$ is to take $a_3$ for all category labels. The risk $v^*(s^{I_h}) = u(s^{I_h}, \sigma^*)$ under the optimal strategy has the approximate values of \$1153, \$1221, \$1154, and \$1358 for the above category labels in blue, red, green, and black, respectively. We also simulate the operator's real-time monitoring and inspection under IDoS attacks when no AM strategy is applied, based on Algorithm 1. The risks $v_0(s^{I_h}) := u(s^{I_h}, \sigma_0)$ under the default AM strategy $\sigma_0 \in \Sigma$ have the approximate values of \$1377, \$1527, \$1378, and \$1620 for the category labels $(s_{SO,P}, s_{CR,L})$, $(s_{SO,P}, s_{CR,H})$, $(s_{SO,C}, s_{CR,L})$, and $(s_{SO,C}, s_{CR,H})$, respectively. These results illustrate that the optimal AM strategy $\sigma^* \in \Sigma$ can significantly reduce the risk under IDoS attacks for all category labels, with a reduction of as much as 20%.

We further investigate the IDoS risk under the optimal AM strategy $\sigma^*$ as follows. As illustrated in Fig. 4, when the criticality level is high (i.e., the attack is more likely to be real), the attacks targeting the cyber layer (denoted in black) result in a higher risk than the ones targeting the physical layer (denoted in red). This asymmetry results from the different rewards for identifying real attacks in the physical and cyber layers listed in Table III. Since dismissing feints brings the same reward in the physical and cyber layers, the attacks targeting the physical or cyber layers result in similar IDoS risks when the criticality level is low. Within the physical or cyber layer, high-criticality alerts result in a higher risk than low-criticality alerts do.

Fig. 4: The convergence of the estimated ECuC $Q^h(s^{I_h}, a^h)$ vs. the number of inspection stages.

The value of $Q^h(s^{I_h}, a_m), m \in \{0, 1, 2\}$, represents the risk when RADAMS deviates to the sub-optimal AM action $a_m$ for a single category label $s^{I_h} \in \mathcal{S}$. As illustrated by the red and black lines in Fig. 4, this single deviation can increase the risk under alerts of high criticality. However, it hardly increases the risk under alerts of low criticality, as illustrated by the green and blue lines in the inset black box of Fig. 4. These results illustrate that we can deviate from the optimal AM strategy to sub-optimal ones for some category labels with approximately equivalent risk, which we refer to as the attentional risk equivalency in Remark 5.

Remark 5 (Attentional Risk Equivalency). The above results illustrate that we can contain the IDoS risk by selecting proper sub-optimal strategies. If applying the optimal AM strategy $\sigma^*$ is costly, then RADAMS can choose not to apply an AM strategy for $(s_{SO,C}, s_{CR,L})$ or $(s_{SO,P}, s_{CR,L})$ without significantly increasing the IDoS risk.

2) Optimal AM Strategy and Resilience Margin under Different Stage Costs: We define the resilience margin as the difference between the risks under the optimal and the default AM strategies. We investigate in Fig. 5 how the cost of an incomplete alert response in Table III affects the optimal AM strategy and the resilience margin.

As shown in the upper figure, the optimal strategy remains AM action $a_3$ when the alert is of high criticality. When the alert is of low criticality, as the cost increases, the optimal AM strategy changes sequentially from $a_3$ to $a_2$, $a_1$, and finally $a_0$; i.e., RADAMS gradually decreases $m \in \{0, 1, 2, 3\}$, the number of de-emphasized alerts. As shown in the lower figure, the resilience margin increases monotonically with the cost. The optimal strategy for alerts of high criticality yields a larger resilience margin than the one for low criticality.

Remark 6 (Tradeoff of Monitoring and Inspection). The results show that the optimal strategy strikes a balance between monitoring a large number of alerts in real time and inspecting selected alerts with high quality. Moreover, the optimal strategy is resilient over a large range of cost values ([$0, $1000]). If the cost is high and the alert is of low

Fig. 5: The optimal AM strategy and the risk vs. the cost of an incomplete alert response under category labels (sSO,P, sCR,L), (sSO,P, sCR,H), (sSO,C, sCR,L), and (sSO,C, sCR,H) in solid red, solid green, dashed yellow, and dashed green, respectively.

(resp. high) criticality, then the optimal strategy encourages monitoring (resp. inspecting) by choosing a small (resp. large) m. However, when the cost of an incomplete alert response is relatively low, the optimal strategy is a4 for all alerts, as the high-quality inspection outweighs the high-quantity monitoring.

3) Arrival Frequency of IDoS Attacks: As stated in Section III-A, feint attacks, whose goal is merely to trigger alerts, require fewer resources to craft. Thus, we let cRE = $0.04 and cFE ∈ (0, cRE) denote the costs to generate a real attack and a feint, respectively. With cRE and cFE, we can compute the attack cost of feint and real attacks per 24-hour work shift. Let ρ be the scaling factor for the arrival frequency; relative to Section VII-B3, the scaled average inter-arrival time is µ̄(θ k, θ k+1) = ρ µ(θ k, θ k+1), ∀θ k, θ k+1 ∈ Θ. We investigate how the scaling factor ρ ∈ (0, 2.5] affects the IDoS risk and the attack cost in Fig. 6. As ρ decreases, the attacker generates feint and real attacks at a higher frequency, and the risks under both the optimal and the default strategies increase. However, the optimal AM strategy can reduce the rate of increase for a large range of ρ ∈ [0.5, 2].
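The per-shift attack cost can be sketched directly from these quantities: scaling the mean inter-arrival time by ρ scales the number of attacks per shift by 1/ρ. The baseline inter-arrival time, the feint cost, and the feint fraction below are assumed for illustration; only cRE = $0.04 and the 86400-second shift come from the text.

```python
# Minimal sketch (assumed model, not the paper's code): attack cost per
# 24-hour shift when the mean inter-arrival time is scaled by rho.
SHIFT_SECONDS = 86400
c_re = 0.04                    # cost of a real attack (from the text)
c_fe = 0.02                    # assumed feint cost, c_fe < c_re
eta_fe = 0.5                   # assumed fraction of feints
base_mean_interarrival = 10.0  # assumed baseline mean inter-arrival time (s)

def attack_cost_per_shift(rho):
    """Cost the attacker pays per shift under scaling factor rho."""
    mean_interarrival = rho * base_mean_interarrival
    num_attacks = SHIFT_SECONDS / mean_interarrival   # frequency scales as 1/rho
    cost_per_attack = eta_fe * c_fe + (1 - eta_fe) * c_re
    return num_attacks * cost_per_attack

for rho in (0.5, 1.0, 2.0):
    print(f"rho={rho}: ${attack_cost_per_shift(rho):.2f}")
```

Halving ρ doubles both the attack frequency and the attack cost, which is the cost growth the attacker faces in Remark 7.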

Fig. 6: IDoS risk vs. ρ under the optimal and the default AM strategies in solid red and dashed blue, respectively. The black line represents the attack cost per 24-hour work shift.

Remark 7 (Attacker's Dilemma). From the attacker's perspective, although increasing the attack frequency can induce a high risk to the organization, from which the attacker gains, it also increases the attack cost exponentially, as shown by the dotted black line in Fig. 6. Thus, the attacker has to strike a balance between the attack cost and the attack gain (represented by the IDoS risk). Moreover, attackers with a limited budget are not capable of choosing small values of ρ (i.e., high attack frequencies).

4) Percentage of Feint and Real Attacks: Consider the case where κAT independently generates feints and real attacks with probability ηFE and ηRE = 1 − ηFE, respectively. We consider the case where the attacker has a limited budget cmax = $270 per work shift (i.e., 86400s) and generates feint and real attacks at the same rate β, i.e., β(θ k, θ k+1) = β, ∀θ k, θ k+1 ∈ Θ. Given the attack costs in Section VII-B3, the attacker has the following budget constraint:

86400 · β · (ηFE cFE +ηRE cRE)≤ cmax. (12)

The budget constraint results in the following tradeoff: if the attacker chooses to increase the probability of real attacks ηRE, then he has to reduce the arrival frequency β of feint and real attacks. We investigate how the probability of feints affects the IDoS risk in Fig. 7 under the optimal and the default AM strategies in red and blue, respectively. The feints are of low and high cost in Figs. 7a and 7b, respectively.
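The tradeoff in constraint (12) can be made explicit by solving for the largest sustainable rate β at each feint probability. The sketch below uses cmax = $270, cRE = $0.04, and the low-cost-feint setting cFE = cRE/10 from the text; it is a model sketch, not the paper's experiment code.

```python
# Sketch of budget constraint (12): the maximal attack rate beta satisfying
# 86400 * beta * (eta_fe * c_fe + eta_re * c_re) <= c_max.
SHIFT_SECONDS = 86400
c_max = 270.0        # attacker's budget per shift (from the text)
c_re = 0.04          # cost of a real attack (from the text)
c_fe = c_re / 10     # low-cost feints (Fig. 7a setting)

def max_rate(eta_fe):
    """Largest beta the budget allows at feint probability eta_fe."""
    eta_re = 1.0 - eta_fe
    per_attack_cost = eta_fe * c_fe + eta_re * c_re
    return c_max / (SHIFT_SECONDS * per_attack_cost)

# More real attacks (smaller eta_fe) force a lower overall rate beta.
print(max_rate(0.0), max_rate(0.5), max_rate(1.0))
```

Since feints are cheaper, an all-feint campaign (ηFE = 1) sustains a rate ten times higher than an all-real campaign (ηFE = 0) under this setting.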

(a) Low-cost feints cFE = 1. (b) High-cost feints cFE = 5.

Fig. 7: IDoS risk vs. ηFE ∈ [0,1] under the optimal and the default AM strategies in red and blue, respectively. The black line represents the resilience margin.

As shown in Fig. 7a, when the feints are of low cost, i.e., cFE = cRE/10, generating feints with a higher probability monotonically increases the IDoS risks under both AM strategies. When the probability of feints is higher than 80%, the resilience margin is zero; i.e., the optimal and the default AM strategies both induce high risks. However, as the probability of feints decreases, the resilience margin increases to around $500; i.e., the default strategy can only moderately reduce the risk, while the optimal strategy reduces it substantially.

Remark 8 (Half-Truth Attack for High-Cost Feints). As shown in Fig. 7b, when the feints are of high cost, i.e., cFE = cRE/2, the optimal attack strategy is to deceive with half-truths, i.e., to generate feint and real attacks with approximately equal probability to induce the maximum IDoS risk. As the probability of feints decreases from ηFE = 1, the risk increases significantly under the default AM strategy but only moderately under the optimal one.

The figures in Fig. 7 show that the optimal attack strategy under the budget constraint (12) needs to adapt to the cost of feint generation. Regardless of the attack strategy, the optimal AM strategy can reduce the risk and achieve a positive resilience margin for all category labels (sSO,i, sCR,j), i ∈ {P,C}, j ∈ {L,H}. Moreover, a higher feint generation cost reduces the arrival frequency of IDoS attacks due to (12). Thus, compared to Fig. 7a, the risk in Fig. 7b is lower for the same ηFE under both the optimal and the default AM strategies, especially when ηFE is close to 1.

5) The Operator's Attention Capacity: We consider the attention function fLOE ◦ fSL with a constant attention threshold, i.e., n(yEL, sk) = n0, ∀yEL, sk ∈ S, and the following trapezoid attention function. If nt ≤ n0, the LOE is ω t = 1; i.e., the operator retains a high LOE when the number of distractions is less than the attention threshold n0. If nt > n0, the LOE ω t gradually decreases as nt increases. Thus, a larger value of n0 indicates a higher attention capacity. We investigate how the value of n0 affects the risk in Fig. 8.
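The trapezoid attention function described above can be sketched as follows. The text only specifies the flat region (LOE = 1 for nt ≤ n0) and a gradual decrease beyond it; the linear decay slope below is an assumption for illustration.

```python
# Sketch of the trapezoid attention function (decay slope is assumed):
# the Level of Engagement (LOE) stays at 1 up to the attention threshold
# n0, then decreases with each additional distraction, floored at 0.
def level_of_engagement(n_distractions, n0, decay=0.25):
    """Trapezoid LOE: 1.0 for n <= n0, then linear decay, floored at 0."""
    if n_distractions <= n0:
        return 1.0
    return max(0.0, 1.0 - decay * (n_distractions - n0))

# A larger n0 (higher attention capacity) keeps the operator fully
# engaged under more concurrent distractions.
for n in range(6):
    print(n, level_of_engagement(n, n0=2))
```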

Fig. 8: Risk vs. attention threshold under the optimal and the default AM strategies in red and blue, respectively. The black dotted line represents the resilience margin.

As the operator's attention capacity increases, the risks under the optimal and the default AM strategies decrease for all category labels. The resilience margin decreases from around $200 to $50 as n0 increases from 0 to 2 and then remains at around $50. Thus, the optimal strategy suits operators with a large range of attention capacities, and especially those with limited attention capacity.

VIII. CONCLUSION

Attentional human vulnerabilities exploited by attackers lead to a new class of proactive attacks called Informational Denial-of-Service (IDoS) attacks. IDoS attacks deliberately generate a large number of feint attacks to deplete the limited human attention resources and exacerbate the alert fatigue problem. In this work, we have formally defined IDoS attacks as a sequence of feint and real attacks with heterogeneous targets, which can be characterized by a Markov renewal process. We have abstracted the alert generation and triage processes as a revelation probability to establish a stochastic relationship between the IDoS attack's hidden types and targets and the associated alert's observable category labels. We have explicitly incorporated human factors (e.g., levels of expertise, stress, and efficiency) and empirical results (e.g., the Yerkes–Dodson law and the sunk cost fallacy) to model the operators' attention dynamics and the processes of alert monitoring, inspection, and response in real time. Based on the system-scientific human attention and alert response model, we have developed a Resilient and Adaptive Data-driven alert and Attention Management Strategy (RADAMS) to assist human operators in combating IDoS attacks. We have proposed a Reinforcement Learning (RL)-based algorithm to obtain the optimal assistive strategy according to the costs of the operator's alert responses in real time.

Through theoretical analysis, we have observed the Product Principle of Attention (PPoA), the fundamental limits of the Attentional Deficiency Level (ADL) and risk, and the tradeoff among the ADL, the reward of alert attention, and the impact of alert inattention. Through the experimental results, we have corroborated the effectiveness, adaptiveness, robustness, and resilience of the proposed assistive strategies. First, the optimal AM strategy outperforms the default strategy and can effectively reduce the IDoS risk by as much as 20%. Second, the strategy adapts to different category labels to strike a balance between monitoring and inspection. Third, the optimal AM strategy is robust to deviations: we can apply sub-optimal strategies for some category labels without significantly increasing the IDoS risk. Finally, the optimal AM strategy is resilient to large variations of costs, attack frequencies, and human attention capacities.
