

2536 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS, VOL. 49, NO. 12, DECEMBER 2019

An Efficient Marginal-Return-Based Constructive Heuristic to Solve the Sensor–Weapon–Target Assignment Problem

Bin Xin, Member, IEEE, Yipeng Wang, and Jie Chen, Senior Member, IEEE

Abstract: In network-centric warfare, the interconnections among various combat resources enable an advanced operational pattern of cooperative engagement. The operational effectiveness and outcome strongly depend on the reasonable utilization of available sensors and weapons. In this paper, a mathematical model for the coallocation of sensors and weapons is built, taking into account the interdependencies between weapons and sensors, the resource constraints, the capability constraints, as well as the strategy constraints. A marginal-return-based constructive heuristic (MRBCH) is proposed to solve the formulated sensor–weapon–target assignment (S-WTA) problem. MRBCH exploits the marginal return of each sensor–weapon–target triplet and dynamically updates the threat value of all targets. It relies only on simple lookup operations to choose each assignment triplet, thus resulting in very low computational complexity. For performance evaluation, we build a general Monte Carlo simulation-based S-WTA framework. Furthermore, we employ a random sampling method and an extension of the state-of-the-art algorithm Swt_opt as competitors. The computational results show that MRBCH consistently performs very well in solving S-WTA instances of different scales, and it can generate assignment schemes much more efficiently than its competitors.

Index Terms: Co-allocation, constructive heuristics, cooperative engagement, sensor–weapon–target assignment (S-WTA), Swt_opt algorithm.

Manuscript received May 6, 2017; revised September 20, 2017; accepted December 7, 2017. Date of publication January 8, 2018; date of current version November 19, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61673058, in part by the NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization under Grant U1609214, in part by the Foundation for Innovative Research Groups of the National Natural Science Foundation of China under Grant 61621063, in part by the Projects of Major International (Regional) Joint Research Program of NSFC under Grant 61720106011, and in part by the Beijing Outstanding Ph.D. Program Mentor under Grant 20131000704. This paper was recommended by Associate Editor L. Fang. (Corresponding author: Bin Xin.)

B. Xin and J. Chen are with the School of Automation, Beijing Institute of Technology, Beijing 100081, China, with the Key Laboratory of Intelligent Control and Decision of Complex Systems, Beijing Institute of Technology, Beijing 100081, China, and also with the Beijing Advanced Innovation Center for Intelligent Robots and Systems, Beijing Institute of Technology, Beijing 100081, China (e-mail: [email protected]).

Y. Wang is with the School of Automation, Beijing Institute of Technology, Beijing 100081, China.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSMC.2017.2784187

I. INTRODUCTION

In network-centric warfare (NCW), distributed combat resources have to be coordinated to engender cooperative engagement capability (CEC) [1]–[3]. In particular, the coordination of sensors and weapons plays a pivotal role in enabling CEC in modern and future combat systems. As two major kinds of resources in combat systems, weapons and sensors are intricately related. In the classic OODA loop developed by Boyd, which refers to a decision cycle of observe, orient, decide, and act, sensors and weapons play the roles of “observer” and “actuator,” respectively. In essence, the OODA loop implies a complex closed-loop command and control system composed of sensors (“observe”), controllers (“orient + decide”), and actuators (“act”), with weapons being the major actuators in the combat system.

From the information links in the OODA loop, the following three types of sensor–weapon relations can be identified.

1) Independence: In the information chain “O→O→D→A,” sensors are used to acquire battlefield information and detect targets, the collected information is processed and used to aid commanders in making decisions, and then weapons are used to strike the targets. A corresponding metaphor is “Eye (observer) → Brain (controller) → Fist (actuator).” Because the “brain” separates them, the roles of sensors and weapons are relatively independent. Both sensors (eyes) and weapons (fists) are regulated by the commander (brain).

2) Attack Effect Evaluation: The corresponding information link can be represented as “$A \xrightarrow{\text{target}} O$ (observe),” with the succeeding link being a new O→O→D→A chain. Sensors in the feedback channel of the whole control system are used to acquire target information for evaluating the attack effect of weapons.

3) Firepower Guide: In this case, the effective operation of weapons relies on sensors. For example, target illumination radars (sensors) are often used to guide the missiles (weapons) to attack the targets. The corresponding link can be represented as “$O$ (observe) $\xrightarrow{\text{target}} A$.” A metaphor for this case is responsive fist–eye coordination.

In the NCW context where distributed sensors and weapons can be paired freely, the above sensor–weapon relations result in different resource allocation problems. In the first case, sensors and weapons are usually scheduled independently, which may lead to sensor–target assignment (STA) problems in reconnaissance and surveillance and weapon–target assignment (WTA) problems in firepower distribution, respectively. The second case may lead to STA problems in damage assessment. In contrast, the third case results in the coallocation of sensors and weapons, that is, the sensor-WTA (S-WTA) problem [4], which is exactly the focus of this paper. Fig. 1 depicts a combat scenario with networked sensors and weapons where the information collected by any sensor is shared by all weapons within the network.

2168-2216 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Fig. 1. Combat scenario with networked resources.

In fact, previous studies mainly focused on STA problems [5]–[10] and WTA problems [11]–[20] separately, and few studies have addressed S-WTA problems. Several scholars considered the influence of sensors on fire allocation when solving specific WTA problems [21]–[26]; however, these studies did not involve the allocation of sensors. The mutual effect between STA and WTA decisions has been considered in recent research on the cooperative engagement of air defense [24], [27].

As for the coallocation of smart weapons and sensors, Bogdanowicz and Coleman [4] established a simple S-WTA model and were the first to put forward an algorithm, based on the auction algorithm, to solve the S-WTA problem. The problem they considered was simplified through the sensor/weapon/target decomposition to the following scenario: each sensor–weapon pair can be assigned to at most one target, and each target can be engaged by at most one sensor–weapon pair. In addition, they focused on a symmetric assignment problem with an input consisting of the same numbers of sensors, weapons, and targets, along with the benefit matrix, and put forward an exact algorithm named Swt_opt for this problem. It has been proved in [28] that Swt_opt converges to an optimal solution within finite steps. This algorithm represents the state of the art for such simplified S-WTA problems. Besides, Li et al. [27] put forward an improved auction algorithm to solve the same problem. Moreover, Chen et al. [29] proposed a particle swarm optimization based on genetic operators to solve a similar simplified S-WTA problem in which a target can be attacked by multiple weapons at once while the other conditions stay the same.

Admittedly, unpredictable events and uncertainties are inevitable in the dynamic environment of the S-WTA problem. Therefore, the S-WTA problem has to be solved repeatedly online in a cyclic manner, as indicated by the OODA loop, and the decision-making process is time-critical because the problem scale and parameters keep changing. Thus, the time efficiency of the S-WTA algorithm is crucial. A global optimizer may not be able to respond quickly to rapid changes, since a massive number of iterations usually consumes a significant amount of time. In contrast, a fast, real-time heuristic is capable of solving online dynamic optimization problems owing to its use of problem-specific knowledge, and it can efficiently provide high-quality solutions that incorporate real-time information.

Moreover, a slew of successful applications have demonstrated the effectiveness and efficiency of problem-specific heuristics [30]–[32]. For example, Yu et al. [30] proposed a heuristic method incorporating the rescheduling order and four priority dispatching rules to solve the steel-making and continuous casting production rescheduling problem efficiently. The rule that the charge with the earliest start time will be rescheduled first was utilized to determine the rescheduling order. Besides, based on the delay minimization rule and priority rule, Li et al. [31] designed a greedy-rule-based heuristic algorithm to solve the train rescheduling problem. In fact, the virtue of the Swt_opt algorithm proposed by Bogdanowicz [28] also originates from heuristic rules which simulate an auction process. It is therefore necessary and promising to develop a heuristic method for solving the S-WTA problem efficiently.

In this paper, we propose an efficient marginal-return-based constructive heuristic (MRBCH) algorithm to solve S-WTA problems by making use of the concept of maximum marginal return (MMR). The term “marginal return” originates from economics: it refers to the additional yield resulting from a one-unit increase in the use of variable inputs while other inputs are held constant. In pioneering research on WTA, denBroeder et al. [42] introduced this concept and built an MMR algorithm to solve a static WTA problem with uniform weapons (SWTA-U) [33], [34]. It was proved that the MMR algorithm can guarantee optimality in solving SWTA-U problems. In this paper, we exploit the MMR as problem-specific knowledge to construct desirable solutions to S-WTA problems. The originality and contributions of this paper are summarized as follows.

1) A mathematical programming model is built for the S-WTA problems arising from the coallocation of sensors and weapons in cooperative engagement with firepower guide. The model not only reflects the interdependence between sensors and weapons in determining the operational effectiveness but also incorporates different kinds of practical constraints.

2) The proposed algorithm, namely MRBCH, constructs a coallocation scheme incrementally by choosing one sensor–weapon–target (SWT) triplet each time. MRBCH exploits the problem-specific knowledge contained in the marginal returns of all the feasible SWT triplets by iteratively choosing the triplet with MMR and updating the value of the involved targets.

a) The worst-case time complexity of MRBCH is O(S·W·T·min(S, W)), where S, W, and T denote the numbers of sensors, weapons, and targets, respectively. Computational results demonstrate that the runtime performance and scalability of MRBCH are superior to those of its competitors.

b) To the best of our knowledge, MRBCH is the first algorithm that can efficiently solve moderately sized instances of the formulated S-WTA problems (e.g., those involving 100 sensors, 90 weapons, and 80 targets).

TABLE I. NOTATION DECLARATION

The rest of this paper is structured as follows. In Section II, we present the mathematical programming model for the S-WTA problem. In Section III, the rule-based heuristic algorithm, i.e., MRBCH, is proposed for solving S-WTA problems along with an analysis of its computational complexity. Section IV presents the experimental design for the performance investigations on the proposed algorithm as well as the comparative results. Finally, Section V concludes this paper.

II. PROBLEM FORMULATION

The combat scenario considered in this paper is as follows. At a certain time, T incoming targets are identified, and each one is assumed to have a certain threat value. The defender has W weapons (e.g., missiles) to intercept the targets, and S sensors (e.g., radars) to track and illuminate the targets so as to guide the weapons. The S sensors and W weapons are collocated in a network to protect a volume of space. Assume that each sensor can detect only one target at a time and each weapon can shoot at only one target at a time. Denote the STA scheme by $Y = [y_{ik}]_{S \times T}$, where $y_{ik}$ is the STA variable regarding the ith sensor and the kth target: $y_{ik} = 1$ if the ith sensor is assigned to the kth target, and $y_{ik} = 0$ otherwise. Denote the WTA scheme by $Z = [z_{jk}]_{W \times T}$, where $z_{jk}$ is the WTA variable regarding the jth weapon and the kth target: $z_{jk} = 1$ if the jth weapon is assigned to the kth target, and $z_{jk} = 0$ otherwise. For clarity, the notation employed in this paper is also listed in Table I.

Denote by $P_{\mathrm{des}}(k)$, $P_s(k)$, and $P_w(k)$ the probability of destroying the kth target, the probability of capturing the target by the sensors used to track and illuminate it, and the conditional probability that the selected weapon or weapon mix destroys the target under effective guidance, respectively. The following relation holds:

$$P_{\mathrm{des}}(k) = P_s(k) \times P_w(k). \tag{1}$$

We define two categories of events: $E_C^{k,i}$ represents the event that the kth target is captured by the ith sensor ($k \in \{1, 2, \ldots, T\}$, $i \in \{1, 2, \ldots, S\}$), and $E_D^{k,j}$ represents the event that the kth target is destroyed by the jth weapon under effective guidance ($k \in \{1, 2, \ldots, T\}$, $j \in \{1, 2, \ldots, W\}$). The following two independence assumptions are made.

1) For any $i_1, i_2 \in \{1, 2, \ldots, S\}$ with $i_1 \neq i_2$, $E_C^{k,i_1}$ and $E_C^{k,i_2}$ are independent.

2) For any $j_1, j_2 \in \{1, 2, \ldots, W\}$ with $j_1 \neq j_2$, $E_D^{k,j_1}$ and $E_D^{k,j_2}$ are independent.

When a set of sensors, denoted by $S_k$, is assigned to the kth target, the probability of capturing the target is

$$P_s(k) = 1 - \prod_{i=1}^{S} (1 - p_{ik})^{y_{ik}} = 1 - \prod_{i \in S_k} (1 - p_{ik}) \tag{2}$$

where $p_{ik}$ is the probability that the kth target is captured by the ith sensor ($i \in \{1, 2, \ldots, S\}$; $k \in \{1, 2, \ldots, T\}$) [15]. In this paper, we simply use the acquisition probability $p_{ik}$ to indicate the performance of the sensor, which depends on the target's attributes and on the sensor's capability, attributes, and coverage [35].

When a set of weapons, denoted by $W_k$, is assigned to the kth target, the probability that the kth target is destroyed by the assigned weapons under effective guidance is

$$P_w(k) = 1 - \prod_{j=1}^{W} (1 - q_{jk})^{z_{jk}} = 1 - \prod_{j \in W_k} (1 - q_{jk}) \tag{3}$$

where $q_{jk}$ is the probability that the kth target is destroyed by the jth weapon ($j \in \{1, 2, \ldots, W\}$; $k \in \{1, 2, \ldots, T\}$) [15]. In this paper, we simply use the kill probability $q_{jk}$ to indicate the performance of the weapon, which depends on the target's attributes and on the weapon's capability and attributes [18].

Remark 1: If a weapon is able to launch salvos of projectiles based on available inventory to shoot at one target, we treat this as one “engagement” action. In this case, $q_{jk}$ represents the total destruction probability of the engagement corresponding to the jth weapon and the kth target.

Hence, the probability that the kth target is destroyed can be calculated as follows:

$$P_{\mathrm{des}}(k) = \Bigl[1 - \prod_{i \in S_k} (1 - p_{ik})\Bigr] \Bigl[1 - \prod_{j \in W_k} (1 - q_{jk})\Bigr]. \tag{4}$$
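Relations (2)–(4) can be sketched for a single target as follows. This is a minimal illustration rather than code from the paper: the function names and the numerical values are assumptions chosen only for the example.

```python
from math import prod


def capture_probability(p_k, sensors):
    """Eq. (2): probability that at least one assigned sensor captures the target.

    p_k[i] is the acquisition probability of sensor i against this target;
    sensors is the index set S_k of the sensors assigned to it.
    """
    return 1.0 - prod(1.0 - p_k[i] for i in sensors)


def kill_probability(q_k, weapons):
    """Eq. (3): probability that at least one assigned weapon destroys the target."""
    return 1.0 - prod(1.0 - q_k[j] for j in weapons)


def destroy_probability(p_k, q_k, sensors, weapons):
    """Eq. (4): product of the capture and conditional kill probabilities."""
    return capture_probability(p_k, sensors) * kill_probability(q_k, weapons)


# Illustrative numbers only (hypothetical, not taken from the paper).
p_k = [0.8, 0.5, 0.6]   # three sensors
q_k = [0.7, 0.4]        # two weapons
p_des = destroy_probability(p_k, q_k, sensors=[0, 1], weapons=[0, 1])
```

With an empty sensor set or an empty weapon set the empty product equals 1, so the destruction probability is 0, which matches the interdependence expressed by (1).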

A. Objective Function

The objective of the considered S-WTA problem is to maximize the expected effectiveness of destroying incoming targets. Thus, we employ the expected total value of the eliminated threat from incoming targets as the objective function

$$J(Y, Z) = \sum_{k=1}^{T} v_k \Bigl[1 - \prod_{i=1}^{S} (1 - p_{ik})^{y_{ik}}\Bigr] \Bigl[1 - \prod_{j=1}^{W} (1 - q_{jk})^{z_{jk}}\Bigr] \tag{5}$$

where $J(\cdot, \cdot)$ is the objective function and $v_k$ denotes the threat value of the kth target ($k \in \{1, 2, \ldots, T\}$).
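As a hedged illustration of (5), the objective can be evaluated directly from the two 0-1 matrices. The nested-list data layout, the function name `objective`, and the sample values are assumptions made for this sketch.

```python
from math import prod


def objective(v, p, q, Y, Z):
    """Eq. (5): expected total threat value eliminated by the schemes Y and Z.

    v[k] is the threat value of target k; p[i][k] and q[j][k] are the
    acquisition and kill probabilities; Y[i][k] and Z[j][k] are 0-1 variables.
    """
    S, W, T = len(p), len(q), len(v)
    total = 0.0
    for k in range(T):
        p_s = 1.0 - prod((1.0 - p[i][k]) ** Y[i][k] for i in range(S))
        p_w = 1.0 - prod((1.0 - q[j][k]) ** Z[j][k] for j in range(W))
        total += v[k] * p_s * p_w
    return total


# Hypothetical instance: two sensors, two weapons, two targets.
v = [10.0, 5.0]
p = [[0.8, 0.6], [0.5, 0.9]]
q = [[0.7, 0.3], [0.4, 0.6]]
Y = [[1, 0], [0, 1]]   # sensor 0 -> target 0, sensor 1 -> target 1
Z = [[1, 0], [0, 1]]   # weapon 0 -> target 0, weapon 1 -> target 1
J = objective(v, p, q, Y, Z)
```

Here J evaluates to 10 × 0.8 × 0.7 + 5 × 0.9 × 0.6 = 8.3.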

B. Constraints

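The following four categories of constraints are included in the S-WTA model:

$$\sum_{k=1}^{T} y_{ik} \le 1 \quad \forall i \in \{1, 2, \ldots, S\} \tag{6}$$

$$\sum_{k=1}^{T} z_{jk} \le 1 \quad \forall j \in \{1, 2, \ldots, W\} \tag{7}$$

$$\sum_{i=1}^{S} y_{ik} \le m_k \quad \forall k \in \{1, 2, \ldots, T\} \tag{8}$$

$$\sum_{j=1}^{W} z_{jk} \le n_k \quad \forall k \in \{1, 2, \ldots, T\}. \tag{9}$$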

Constraint set (6) reflects the maximum tracking capacity of the sensors. As a simplification, we assume each sensor may detect only one target. Similarly, constraint set (7) reflects the capability of weapons to fire at multiple targets at the same time. In this paper, we assume that every weapon can shoot at only one target at a time, since a majority of actual weapons can engage only one target at a time. Besides, a weapon capable of engaging multiple targets simultaneously can be treated as multiple separate weapons. Constraint sets (8) and (9) limit the sensor cost and the weapon cost for each target, respectively. The setting of $m_k$ and $n_k$ ($k = 1, 2, \ldots, T$) usually depends on the evaluation of the threat of incoming targets and the combat performance of available sensors and weapons. For simplicity, in this paper, we set the values of $m_k$ and $n_k$ according to the threat value $v_k$. In other words, targets with higher threat can consume more resources.
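Constraint sets (6)–(9) translate into four simple row and column sum checks. The following sketch assumes Y and Z are stored as nested lists of 0-1 values; the function name `is_feasible` and the sample data are hypothetical.

```python
def is_feasible(Y, Z, m, n):
    """Check constraint sets (6)-(9) for the 0-1 schemes Y (S x T) and Z (W x T)."""
    S, W, T = len(Y), len(Z), len(m)
    # (6): each sensor tracks at most one target.
    sensors_ok = all(sum(Y[i]) <= 1 for i in range(S))
    # (7): each weapon shoots at most one target.
    weapons_ok = all(sum(Z[j]) <= 1 for j in range(W))
    # (8): at most m[k] sensors per target.
    sensor_caps_ok = all(sum(Y[i][k] for i in range(S)) <= m[k] for k in range(T))
    # (9): at most n[k] weapons per target.
    weapon_caps_ok = all(sum(Z[j][k] for j in range(W)) <= n[k] for k in range(T))
    return sensors_ok and weapons_ok and sensor_caps_ok and weapon_caps_ok


# Hypothetical data: both sensors on target 0, one weapon on target 0.
Y = [[1, 0], [1, 0]]
Z = [[1, 0], [0, 0]]
tight = is_feasible(Y, Z, m=[1, 1], n=[1, 1])   # violates (8) for target 0
loose = is_feasible(Y, Z, m=[2, 1], n=[1, 1])   # allowed once m_0 = 2
```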

C. Optimization Model for S-WTA

The optimization model for the aforementioned S-WTA problem can be formulated as follows:

$$\max J(Y, Z), \quad \text{s.t. (6)–(9)} \tag{10}$$

which is a typical constrained nonlinear 0-1 programming problem.

Remark 2: If the STA scheme Y has been given, then $P_s(k)$ ($\forall k \in \{1, 2, \ldots, T\}$) becomes constant. Let $v'_k = v_k \times P_s(k)$; then the above S-WTA problem degenerates into the following WTA problem:

$$\max J_1(Z) = \sum_{k=1}^{T} v'_k \Bigl[1 - \prod_{j=1}^{W} (1 - q_{jk})^{z_{jk}}\Bigr], \quad \text{s.t. (7) and (9)} \tag{11}$$

or equivalently

$$\min J_2(Z) = \sum_{k=1}^{T} v'_k \prod_{j=1}^{W} (1 - q_{jk})^{z_{jk}}, \quad \text{s.t. (7) and (9)}. \tag{12}$$

In this sense, the WTA problem can be regarded as a special case of the S-WTA problem. Since Lloyd and Witsenhausen [36] have established the NP-completeness of the WTA problem, the above S-WTA problem is also NP-complete.
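The equivalence between (11) and (12) stated in Remark 2 can be made explicit with one expansion. The step below uses only the symbols already defined in the remark:

```latex
J_1(Z) = \sum_{k=1}^{T} v'_k \Bigl[ 1 - \prod_{j=1}^{W} (1 - q_{jk})^{z_{jk}} \Bigr]
       = \sum_{k=1}^{T} v'_k \;-\; \sum_{k=1}^{T} v'_k \prod_{j=1}^{W} (1 - q_{jk})^{z_{jk}}
       = \sum_{k=1}^{T} v'_k \;-\; J_2(Z).
```

Because Y is fixed, the term $\sum_k v'_k$ does not depend on Z, so maximizing $J_1$ and minimizing $J_2$ over the same feasible set select the same schemes.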

Remark 3: If $m_k = n_k = 1$ $\forall k \in \{1, 2, \ldots, T\}$ and $S = W = T = n$ (n is a positive integer with $n > 1$), that is to say, at most one sensor and at most one weapon are allowed to be assigned to one target, and each sensor can detect only one target at a time, then the S-WTA problem can be simplified as follows:

$$\max J(Y, Z) = \sum_{k=1}^{n} \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ijk}\, y_{ik}\, z_{jk} \tag{13}$$

$$\text{s.t.} \quad \sum_{k=1}^{n} y_{ik} \le 1 \quad \forall i \in \{1, 2, \ldots, n\} \tag{14}$$

$$\sum_{k=1}^{n} z_{jk} \le 1 \quad \forall j \in \{1, 2, \ldots, n\} \tag{15}$$

$$\sum_{i=1}^{n} y_{ik} \le 1 \quad \forall k \in \{1, 2, \ldots, n\} \tag{16}$$

$$\sum_{j=1}^{n} z_{jk} \le 1 \quad \forall k \in \{1, 2, \ldots, n\} \tag{17}$$

where $b_{ijk} = v_k \times p_{ik} \times q_{jk}$ is the benefit of the SWT triplet (i, j, k). In fact, the simplified problem implies that each sensor and each weapon will be assigned exactly once and each target will be assigned to exactly one sensor and one weapon. Therefore, the simplified S-WTA problem is essentially the same as the one put forward by Bogdanowicz and Coleman [4], and Swt_opt is an exact algorithm for solving the problem [28].
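The benefit $b_{ijk}$ in Remark 3 arises because, with exactly one sensor i and one weapon j on target k, the two brackets in (5) reduce to $p_{ik}$ and $q_{jk}$. The sketch below builds the benefit tensor and evaluates (13) for a one-to-one assignment; the function names, the `sensor_of`/`weapon_of` encoding, and the numbers are assumptions for illustration.

```python
def benefit(v, p, q):
    """b[i][j][k] = v[k] * p[i][k] * q[j][k], the benefit of triplet (i, j, k)."""
    return [[[v[k] * p[i][k] * q[j][k] for k in range(len(v))]
             for j in range(len(q))]
            for i in range(len(p))]


def simplified_objective(b, sensor_of, weapon_of):
    """Eq. (13) when target k is served by sensor_of[k] and weapon_of[k]."""
    return sum(b[sensor_of[k]][weapon_of[k]][k] for k in range(len(sensor_of)))


# Hypothetical symmetric instance with n = 2.
v = [10.0, 5.0]
p = [[0.8, 0.6], [0.5, 0.9]]
q = [[0.7, 0.3], [0.4, 0.6]]
b = benefit(v, p, q)
value = simplified_objective(b, sensor_of=[0, 1], weapon_of=[0, 1])
```

For this instance the value is b[0][0][0] + b[1][1][1] = 5.6 + 2.7 = 8.3.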

III. RULE-BASED HEURISTIC FOR S-WTA

A. Auxiliary Decision Matrix

As mentioned in Section I, this paper focuses on the case of “firepower guide,” where the effective operation of weapons relies on sensors. In this situation, sensors are explicitly used to guide weapons with an engage-on-remote policy (i.e., weapons require guidance from sensors, and they can use any sensor in the network, not just the one on their own platform). We do not address the detection or tracking problem, in which observing a target has value even when we are not currently trying to shoot at it. It can be seen from (5) that, for any target, if no sensor is assigned to it, then $P_s(k) = 0$ and thus $P_{\mathrm{des}}(k) \equiv 0$, so assigning a weapon to it contributes nothing. On the other hand, if no weapon is used [i.e., $P_w(k) = 0$], then the use of any sensor for the target makes no contribution to the objective ($P_{\mathrm{des}}(k) \equiv 0$). In view of these facts, we employ a 3-D matrix $X = [x_{ijk}]_{S \times W \times T}$ as an auxiliary decision matrix, where $x_{ijk}$ corresponds to one SWT triplet: $x_{ijk} = 1$ if the ith sensor and the jth weapon are assigned to the kth target, and $x_{ijk} = 0$ otherwise. In this way, each assignment for one target includes one sensor and one weapon. However, compared with X, the original decision matrices Y and Z describe the assignment scheme more explicitly. To remove irrational schemes in which some weapons (respectively, sensors) but no sensors (respectively, weapons) are assigned to the same target, the connections between Y, Z, and X are specified as follows:

$$y_{ik} = \max_{j} \{x_{ijk}\}, \qquad z_{jk} = \max_{i} \{x_{ijk}\}$$

that is, $y_{ik} = 1$ if the ith sensor and one or more weapons are assigned to the kth target, and $y_{ik} = 0$ otherwise; $z_{jk} = 1$ if the jth weapon and one or more sensors are assigned to the kth target, and $z_{jk} = 0$ otherwise.
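The two max relations between X, Y, and Z can be sketched as a projection of the 3-D matrix onto its sensor–target and weapon–target faces. The nested-list layout `X[i][j][k]` and the function name `project` are assumptions of this example.

```python
def project(X):
    """Recover Y (S x T) and Z (W x T) from the S x W x T auxiliary matrix X."""
    S, W, T = len(X), len(X[0]), len(X[0][0])
    # y_ik = max over weapons j of x_ijk
    Y = [[max(X[i][j][k] for j in range(W)) for k in range(T)] for i in range(S)]
    # z_jk = max over sensors i of x_ijk
    Z = [[max(X[i][j][k] for i in range(S)) for k in range(T)] for j in range(W)]
    return Y, Z


# Two sensors, two weapons, two targets; triplets (0, 1, 0) and (1, 0, 1) are chosen.
X = [[[0, 0], [1, 0]],
     [[0, 1], [0, 0]]]
Y, Z = project(X)
```

By construction, a sensor appears in Y for a target only if some weapon is paired with it on that target, and vice versa, which is how the irrational schemes are excluded.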

B. Constraints Handling

To guarantee the feasibility of the solutions generated by the heuristic algorithm, we define some variables to record the use of resources. For constraints (6) and (7), $N_S$ and $N_W$ record the number of times that every sensor and every weapon are used in the current scheme, respectively. For constraints (8) and (9), $N_{TS}$ and $N_{TW}$ record the number of sensors and weapons assigned to each target in the current scheme, respectively. These variables are defined as follows:

$N_S = [n_S(i)]_{1 \times S}$: $n_S(i)$ denotes the number of times that the ith sensor is used;

$N_W = [n_W(j)]_{1 \times W}$: $n_W(j)$ denotes the number of times that the jth weapon is used;

$N_{TS} = [n_{tS}(k)]_{1 \times T}$: $n_{tS}(k)$ denotes the number of sensors assigned to target k;

$N_{TW} = [n_{tW}(k)]_{1 \times T}$: $n_{tW}(k)$ denotes the number of weapons assigned to target k.

Each time a new triplet (i, j, k) is selected for the current assignment scheme (i.e., the corresponding decision variable is set to $x_{ijk} = 1$), the variables in the aforementioned vectors are updated. The rules for handling constraint sets (6)–(9) during assignment are as follows.

1) If $n_S(i) = 1$, the ith sensor will not be assigned to other targets any more.

2) If $n_W(j) = 1$, the jth weapon will not be assigned to other targets any more.

3) If $n_{tS}(k) = m_k$, no more sensors will be assigned to the kth target.

4) If $n_{tW}(k) = n_k$, no more weapons will be assigned to the kth target.

At the outset of the solution construction process, an empty auxiliary decision matrix is generated, and all the above variables are set to zero. Then, one SWT triplet is chosen according to the MMR-based rules (see Section III-C), the corresponding element of the auxiliary decision matrix is set to one, and the corresponding variables for constraint handling are updated. More SWT triplets are then added in the same way. Before a triplet can be chosen, it is checked against constraint sets (6)–(9) by comparing the related variables with the corresponding limits. If an SWT triplet is feasible, it is kept in the array of feasible triplets (denoted by AT); otherwise, it is removed. A complete decision scheme is thus constructed incrementally: each time, only one triplet is added into the auxiliary decision matrix according to the MMR-based rules presented in the following.
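The four counters and the rules 1)–4) can be sketched as a small bookkeeping class. This is an illustrative reading, not the authors' implementation: it assumes that every new triplet brings a sensor and a weapon that have not been used before, and the class and method names are hypothetical.

```python
class Usage:
    """Counters N_S, N_W, N_TS, N_TW used to keep a partial scheme feasible."""

    def __init__(self, S, W, m, n):
        self.m, self.n = m, n
        self.n_s = [0] * S          # n_S(i): times sensor i is used
        self.n_w = [0] * W          # n_W(j): times weapon j is used
        self.nt_s = [0] * len(m)    # n_tS(k): sensors assigned to target k
        self.nt_w = [0] * len(n)    # n_tW(k): weapons assigned to target k

    def is_feasible(self, i, j, k):
        """Rules 1)-4), under the assumption that sensor i and weapon j must be unused."""
        return (self.n_s[i] < 1 and self.n_w[j] < 1
                and self.nt_s[k] < self.m[k] and self.nt_w[k] < self.n[k])

    def add(self, i, j, k):
        """Record that triplet (i, j, k) has entered the scheme."""
        self.n_s[i] += 1
        self.n_w[j] += 1
        self.nt_s[k] += 1
        self.nt_w[k] += 1


usage = Usage(S=2, W=2, m=[1, 1], n=[1, 1])
usage.add(0, 0, 0)
```

After this single update, any triplet reusing sensor 0 or weapon 0, or targeting the saturated target 0, is rejected, while (1, 1, 1) remains feasible.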

C. Rules for S-WTA Problem Solver

The use of domain knowledge can reduce the complexity of problems to be solved, which is one of the main reasons for the popularity of heuristics [11], [14], [32]. Here we try to make the best of the knowledge contained in the structure and parameters of S-WTA problems. To this end, the concept of “marginal return” is borrowed from economics. The marginal return is defined as the additional yield resulting from a one-unit increase in the use of variable inputs while other inputs are held constant. In this paper, the marginal return of an SWT triplet refers to the increase of the objective value resulting from adding only one feasible triplet into the decision matrix X. Moreover, we employ a marginal return matrix $\Delta = [\delta_{ijk}]_{S \times W \times T}$ to record the marginal returns of all triplets. The element $\delta_{ijk}$ represents the marginal return when an additional triplet (i, j, k) is added to the assignment scheme, and it can be computed as follows:

$$\begin{aligned}
u_1(k) &= v_k \bigl[1 - P_{\mathrm{mis}}(k)\bigr] \bigl[1 - Q_{\mathrm{mis}}(k)\bigr] \\
u_2(i, j, k) &= v_k \bigl[1 - P_{\mathrm{mis}}(k)(1 - p_{ik})^{1 - y_{ik}}\bigr] \times \bigl[1 - Q_{\mathrm{mis}}(k)(1 - q_{jk})^{1 - z_{jk}}\bigr] \\
\delta_{ijk} &= u_2(i, j, k) - u_1(k)
\end{aligned} \tag{18}$$

where u1(k) and u2(i, j, k) denote the expected values of theeliminated threat from the kth target, before and after addingthe triplet (i, j, k) into the current assignment scheme, respec-tively. Define U1 = [u1(k)]1×T and U2 = [u2(i, j, k)]S×W×T .Pmis(k) stands for the probability that the kth target is missedby the sensors assigned to it. Qmis(k) denotes the probabilitythat the kth target is missed by the weapons assigned to thekth target. Based on the above concept, we adopt the followingtwo rules to build the proposed heuristic algorithm.

1) The more marginal return an SWT triplet can bring, the higher priority it should be given in the process of assignment.

2) If any available triplet is confirmed to be adopted, the threat value of the corresponding target will be reduced.

The first rule indicates that the feasible SWT triplet with the MMR should be added into the current decision scheme X. According to the second rule, after adding one triplet into the decision scheme, the value of the related target should be updated.

The MMR is given by

$$
\delta_{i^* j^* k^*} = \max_{(i, j, k) \in CT} \{\delta_{ijk}\} \tag{19}
$$

where CT is the set of all feasible SWT triplets with respect to the current assignment scheme and $(i^*, j^*, k^*)$ is the SWT triplet corresponding to the MMR.
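As a minimal sketch of (18) and (19), the following Python snippet evaluates the marginal return of every candidate triplet and picks the one with the MMR. The function names, the zero-based indexing, and the toy numbers are illustrative assumptions and are not taken from the paper; the candidate list stands in for the set CT.

```python
from itertools import product


def marginal_returns(v, p, q, y, z, p_mis, q_mis, candidates):
    """Return {(i, j, k): delta_ijk} for the candidate triplets, following Eq. (18)."""
    deltas = {}
    for i, j, k in candidates:
        u1 = v[k] * (1 - p_mis[k]) * (1 - q_mis[k])
        # The exponents 1 - y_ik and 1 - z_jk switch off a factor already counted.
        new_p_mis = p_mis[k] * (1 - p[i][k]) ** (1 - y[i][k])
        new_q_mis = q_mis[k] * (1 - q[j][k]) ** (1 - z[j][k])
        u2 = v[k] * (1 - new_p_mis) * (1 - new_q_mis)
        deltas[(i, j, k)] = u2 - u1
    return deltas


def select_mmr(deltas):
    """Eq. (19): the candidate triplet with the maximum marginal return."""
    return max(deltas, key=deltas.get)


# Hypothetical toy instance: 2 sensors, 2 weapons, 2 targets, nothing assigned yet.
v = [100.0, 60.0]
p = [[0.9, 0.6], [0.7, 0.8]]    # p[i][k]: sensor i detects target k
q = [[0.8, 0.9], [0.95, 0.75]]  # q[j][k]: weapon j kills target k
y = [[0, 0], [0, 0]]
z = [[0, 0], [0, 0]]
p_mis = [1.0, 1.0]
q_mis = [1.0, 1.0]
candidates = list(product(range(2), range(2), range(2)))

deltas = marginal_returns(v, p, q, y, z, p_mis, q_mis, candidates)
best = select_mmr(deltas)
```

With an empty scheme, $u_1(k) = 0$, so each marginal return reduces to $v_k p_{ik} q_{jk}$; in this toy instance the first sensor paired with the second weapon on the first target comes out on top.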

Once the triplet $(i^*, j^*, k^*)$ is assigned, $P_{\mathrm{mis}}(k^*)$ and $Q_{\mathrm{mis}}(k^*)$ will be updated as follows:

$$
P_{\mathrm{mis}}(k^*) = \prod_{i \in S_{k^*}} (1 - p_{ik^*}), \qquad
Q_{\mathrm{mis}}(k^*) = \prod_{j \in W_{k^*}} (1 - q_{jk^*}). \tag{20}
$$

Clearly, $P_{\mathrm{mis}}(k)$ and $Q_{\mathrm{mis}}(k)$ decrease as more and more sensors and weapons are assigned, which is consistent with the second rule.
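The product form in (20) and the multiplicative update used inside the main loop of Algorithm 1 give the same value, which the following short Python sketch illustrates for the sensors of a single target. The probabilities and the assignment order are made-up illustrative values.

```python
from math import prod


def miss_probability(probs, assigned):
    """Eq. (20): probability that every assigned resource misses the target."""
    return prod(1 - probs[r] for r in assigned)


# Hypothetical detection probabilities of three sensors for one target.
p_k = [0.9, 0.7, 0.6]

assigned = []
p_mis = 1.0        # nothing assigned yet, so the target is missed with certainty
history = []
for sensor in (0, 2, 1):
    assigned.append(sensor)
    p_mis *= 1 - p_k[sensor]     # incremental update, one factor per new sensor
    history.append(p_mis)
```

Each newly assigned sensor multiplies the miss probability by a factor below one, so the sequence stored in `history` is strictly decreasing.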

D. Processing Procedure

The processing procedure of MRBCH is shown in Fig. 2. In each iteration, each available triplet is checked to determine whether adding it into the decision scheme would violate any constraint. Triplets that violate any constraint are skipped, and all feasible triplets are recorded in the matrix $AT_{l \times 3}$. Note that the initial value of l is equal to S · W · T, and it is renewed according to the number of available triplets during the process of MRBCH. Then, $U_1$, $U_2$, and $\Delta$ are updated, and the triplet $(i^*, j^*, k^*)$ with the MMR is picked out from all available triplets. It is added into the decision scheme, resulting in the update of relevant variables such as Y, Z, NS, NW, NTS, NTW, V, $P_{\mathrm{mis}}(k^*)$, and $Q_{\mathrm{mis}}(k^*)$. The algorithm terminates when no triplets are available for assignment. It is clear that the larger the marginal return of an SWT triplet is, the earlier the assignment is performed, which is consistent with the first assignment rule presented in the previous section.

Given the input variables S, W, T, V, P, Q, $m_k$, and $n_k$, MRBCH is described as Algorithm 1.

Algorithm 1: MRBCH
Input: S, W, T, V, P, Q, m_k, n_k
Output: X, Y, Z, J(Y, Z)
    Initialize elements of X, Y, Z, AT, NS, NW, NTS, NTW to zero
    L ← S × W × T; l ← L
    for k = 1 to T do
        Pmis(k) ← 1; Qmis(k) ← 1
    end for
    m ← 1
    for i = 1 to S do
        for j = 1 to W do
            for k = 1 to T do
                AT(m, :) ← (i, j, k); m ← m + 1
            end for
        end for
    end for
    repeat
        for p = 1 to l do
            if the triplet (i_p, j_p, k_p) violates any constraint then
                Delete this triplet from AT
            end if
        end for
        l ← length(AT)
        for k = 1 to T do
            u1(k) ← v_k [1 − Pmis(k)] [1 − Qmis(k)]
        end for
        for p = 1 to l do
            (i_p, j_p, k_p) ← the pth triplet in AT
            u2(i_p, j_p, k_p) ← v_{k_p} [1 − Pmis(k_p) (1 − p_{i_p k_p})^(1 − y_{i_p k_p})] [1 − Qmis(k_p) (1 − q_{j_p k_p})^(1 − z_{j_p k_p})]
            δ_{i_p j_p k_p} ← u2(i_p, j_p, k_p) − u1(k_p)
        end for
        p* ← arg max_p δ_{i_p j_p k_p}
        (i*, j*, k*) ← the p*th triplet in AT
        x_{i* j* k*} ← 1
        Pmis(k*) ← Pmis(k*) × (1 − p_{i* k*})^(1 − y_{i* k*})
        Qmis(k*) ← Qmis(k*) × (1 − q_{j* k*})^(1 − z_{j* k*})
        y_{i* k*} ← 1; z_{j* k*} ← 1
        Delete the p*th triplet in AT
        Update NS, NW, NTS, NTW
        l ← l − 1
    until l ≤ 0
    Calculate J(Y, Z)
    return X, Y, Z, J(Y, Z)
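To make the control flow of Algorithm 1 concrete, here is a compact Python sketch of the greedy construction. Constraints (6)–(9) are not restated in this section, so the feasibility test below is an assumption made only for illustration: each sensor and each weapon serves at most one target, and target k accepts at most `m[k]` sensors and `n[k]` weapons. The objective is taken as the sum of $u_1(k)$ over all targets. The function name `mrbch`, the helper names, and the toy data are hypothetical, and the sketch makes no claim to reproduce the authors' MATLAB code.

```python
from itertools import product


def mrbch(v, p, q, m, n):
    """Greedy construction in the spirit of Algorithm 1 (illustrative sketch).

    v[k]: threat value; p[i][k], q[j][k]: detection / kill probabilities;
    m[k], n[k]: maximum number of sensors / weapons allowed on target k.
    """
    S, W, T = len(p), len(q), len(v)
    y = [[0] * T for _ in range(S)]      # y[i][k] = 1 if sensor i observes target k
    z = [[0] * T for _ in range(W)]      # z[j][k] = 1 if weapon j engages target k
    sensor_of = [None] * S               # ASSUMED: a sensor serves at most one target
    weapon_of = [None] * W               # ASSUMED: a weapon serves at most one target
    n_sens, n_weap = [0] * T, [0] * T    # resources already committed to each target
    p_mis, q_mis = [1.0] * T, [1.0] * T
    at = list(product(range(S), range(W), range(T)))   # candidate triplets (AT)
    x = []                                             # chosen triplets, in order

    def feasible(i, j, k):
        if sensor_of[i] not in (None, k) or weapon_of[j] not in (None, k):
            return False
        if not y[i][k] and n_sens[k] >= m[k]:
            return False
        if not z[j][k] and n_weap[k] >= n[k]:
            return False
        return True

    def delta(i, j, k):
        # Marginal return of triplet (i, j, k), Eq. (18).
        u1 = v[k] * (1 - p_mis[k]) * (1 - q_mis[k])
        u2 = (v[k]
              * (1 - p_mis[k] * (1 - p[i][k]) ** (1 - y[i][k]))
              * (1 - q_mis[k] * (1 - q[j][k]) ** (1 - z[j][k])))
        return u2 - u1

    while True:
        at = [t for t in at if feasible(*t)]         # drop infeasible triplets
        if not at:                                   # nothing left to assign
            break
        i, j, k = max(at, key=lambda t: delta(*t))   # triplet with the MMR
        x.append((i, j, k))
        if not y[i][k]:
            p_mis[k] *= 1 - p[i][k]
            y[i][k], sensor_of[i] = 1, k
            n_sens[k] += 1
        if not z[j][k]:
            q_mis[k] *= 1 - q[j][k]
            z[j][k], weapon_of[j] = 1, k
            n_weap[k] += 1
        at.remove((i, j, k))

    objective = sum(v[k] * (1 - p_mis[k]) * (1 - q_mis[k]) for k in range(T))
    return x, y, z, objective


# Hypothetical toy instance with one sensor and one weapon allowed per target.
v = [100.0, 60.0]
p = [[0.9, 0.6], [0.7, 0.8]]
q = [[0.8, 0.9], [0.95, 0.75]]
x, y, z, objective = mrbch(v, p, q, m=[1, 1], n=[1, 1])
```

The sketch stops as soon as the feasibility filter leaves the candidate array empty, which corresponds to the termination condition described in the text. Under the assumed constraints, a triplet that pairs an already assigned sensor with an already assigned weapon on the same target has zero marginal return and is only picked once nothing better remains.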

E. Computational Complexity Analysis

The time complexity of the proposed algorithm can be approximated by that of the lookup operations embedded in the heuristic. A desirable lookup algorithm has the complexity of O(l), where l is the size of the array for lookup. Being initialized to S · W · T, the size l decreases during the assignment process of the proposed algorithm. Each time one triplet is assigned, at least (S + W) × T related triplets will become infeasible due to the constraints in (6) and (7). Thus, the lookup operation runs at most S · W/(S + W) times (see Algorithm 1). Therefore, the worst-case time complexity of the heuristic can be approximately expressed by O(min(S, W) · S · W · T).

Fig. 2. Flowchart of the MRBCH.
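The step from the iteration bound to the stated worst-case complexity can be spelled out as follows, assuming that each lookup costs at most O(S · W · T), i.e., the initial size of the array:

$$
\frac{S W}{S + W} = \min(S, W) \cdot \frac{\max(S, W)}{S + W} < \min(S, W)
\quad \Longrightarrow \quad
\frac{S W}{S + W} \cdot O(S W T) \subseteq O\bigl(\min(S, W) \cdot S W T\bigr).
$$

For example, with S = 180, W = 170, and T = 160, the iteration bound is 30 600/350 ≈ 87.4, which is below min(S, W) = 170, while the initial array size is S · W · T = 4 896 000.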

F. Discussion on Parallel Implementation

Parallel computing is undoubtedly one of the keys to speeding up the computation of an algorithm. Because of the dependence among the operations in the process of constructing a solution, the proposed heuristic is inherently sequential and cannot be parallelized as a whole. However, it can be sped up by parallel implementation of constraint handling and the update of the MMR.

Assume that the time cost of a basic operation (e.g., addition, subtraction, multiplication, division, and others) is $t_0$. The process of constraint handling for an SWT triplet includes S · T, W · T, S · T, and W · T additions for (6)–(9), respectively. The successive additions in constraint handling can be parallelized, in which case they take log T, log T, log S, and log W operations for the constraint handling of (6)–(9), respectively. For each sensor, the process of constraint handling of (6) is independent, so these processes can be carried out simultaneously. Likewise, the processes of constraint handling of (7)–(9) for each weapon and each target can be parallelized as well. As shown in Fig. 2, l triplets need to be checked for constraint violations in each iteration of the main loop, and the processes of constraint handling of the l triplets are independent. There are S · W · T triplets in total that need to be checked. Hence, the total time cost of constraint handling is approximately $2 S W T^2 (S + W) \cdot t_0$ when carried out step by step. Besides, at most S · W/(S + W) iterations will be carried out (see Section III-E), and they cannot be parallelized because of the dependency among them. Thus, the total time cost of constraint handling when using parallel computing is approximately $\frac{S W}{S + W} \log(\max\{S, W, T\}) \cdot t_0$.

Similarly, the update of the MMR can be sped up as well. In each iteration, there are T, S · W · T, and S · W · T operations in the updates of the matrices $U_1$, $U_2$, and $\Delta$, respectively. Besides, the process of finding the maximum element (i.e., MMR) in $\Delta$ needs S · W · T operations. Thus, the total time cost of the update of MMR in a sequential way is approximately $\frac{S W}{S + W}(3 S W T + T) \cdot t_0$. Since the updates of $U_1$ and $U_2$ are independent, they can be performed simultaneously. Furthermore, all the elements in $U_1$, $U_2$, and $\Delta$ can be updated in parallel. In addition, the process of finding the maximum element in $\Delta$ includes S · W · T successive comparison operations. These comparisons can be parallelized as well, in which case they need log(S · W · T) operations. Since $\Delta$ can be updated only after $U_1$ and $U_2$ have been updated, the total time cost of the update of MMR is approximately $\frac{S W}{S + W}[2 + \log(S W T)] \cdot t_0$ when using parallel computing.

Given the above, because of the dependency between the processes of constraint handling and the update of MMR, the theoretical speedup ratio of MRBCH is approximately

$$
\frac{\left[2 S W T^2 (S + W) + \frac{S W}{S + W}(3 S W T + T)\right] \cdot t_0}
     {\frac{S W}{S + W}\left[\log(\max\{S, W, T\}) + 2 + \log(S W T)\right] \cdot t_0}
= \frac{2 T^2 (S + W)^2 + 3 S W T + T}{\log(S W T) + \log(\max\{S, W, T\}) + 2}. \tag{21}
$$
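As a quick numerical check of (21), the Python sketch below evaluates the sequential and parallel cost estimates in units of $t_0$ and compares their quotient with the simplified right-hand side. The base of the logarithm is not stated in the text, so base 2 is assumed here; the function names are illustrative.

```python
from math import log2


def sequential_cost(S, W, T):
    """Sequential cost estimate in units of t0 (numerator of Eq. (21))."""
    iterations = S * W / (S + W)
    return 2 * S * W * T**2 * (S + W) + iterations * (3 * S * W * T + T)


def parallel_cost(S, W, T):
    """Parallel cost estimate in units of t0 (denominator of Eq. (21)), log base 2 assumed."""
    iterations = S * W / (S + W)
    return iterations * (log2(max(S, W, T)) + 2 + log2(S * W * T))


def speedup_ratio(S, W, T):
    """Simplified right-hand side of Eq. (21)."""
    numerator = 2 * T**2 * (S + W) ** 2 + 3 * S * W * T + T
    denominator = log2(S * W * T) + log2(max(S, W, T)) + 2
    return numerator / denominator


ratio_small = speedup_ratio(6, 6, 6)
ratio_large = speedup_ratio(40, 40, 40)
check = sequential_cost(40, 40, 40) / parallel_cost(40, 40, 40)
```

The quotient of the two cost estimates agrees with the closed form, and the ratio grows quickly with the instance size. It is a theoretical figure that presumes enough processing units for every parallel step.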

IV. COMPUTATIONAL EXPERIMENTS AND ANALYSIS

This section is devoted to the performance investigation of the proposed algorithm. First, we present an S-WTA test-case generator which can produce instances of different scales. Then, we briefly introduce two algorithms for comparison. Next, we describe the parameter settings of the empirical studies. Finally, we present the comparative results and analyze them. All experiments were carried out in the MATLAB R2016b environment on a PC with an Intel Xeon E5 CPU at 2.60 GHz and 64 GB of memory, rather than on customized parallel computing platforms.

A. Test-Case Generator

In preparation for testing the performance of different algorithms in solving S-WTA problems of different scales, we first developed a test-case generator. Given S, W, and T, the generator provides the essential parameters for an S-WTA instance, which include V, P, Q, $l_i$, $m_k$, and $n_k$ (k = 1, 2, . . . , T).

1) Generation of V: The threat values of the targets are generated as uniformly distributed random integers in the range from 1 to 1000.

2) Generation of P:

$$
p_{ik} := p_L + (p_H - p_L) \cdot \mathrm{rand} \tag{22}
$$

where i = 1, 2, . . . , S, k = 1, 2, . . . , T; $p_L$ and $p_H$ are predefined constants with $0 < p_L < p_H < 1$ which reflect the lower and upper limits of sensors’ performance, respectively.

3) Generation of Q:

$$
q_{jk} := q_L + (q_H - q_L) \cdot \mathrm{rand} \tag{23}
$$

where j = 1, 2, . . . , W, k = 1, 2, . . . , T; $q_L$ and $q_H$ are predefined constants with $0 < q_L < q_H < 1$ which reflect the lower and upper bounds of the probability that a weapon kills the engaged target under effective guidance, respectively. In the sequel, the parameters are set as $p_L = 0.50$, $p_H = 0.95$, $q_L = 0.72$, and $q_H = 0.96$.

4) Generation of $l_i$, $m_k$, and $n_k$: We categorize the generation of $m_k$ and $n_k$ into the following two cases.

Case 1: S, W, and T take the same value, and each target is assigned to exactly one sensor and one weapon

$$
m_k = 1, \quad n_k = 1, \quad \text{for } k = 1, 2, \ldots, T. \tag{24}
$$

Case 2: S, W, and T may take different values. The number of sensors or weapons which are allowed to be assigned to one target is set to 1, 2, or 3 according to the threat value of the target

$$
m_k =
\begin{cases}
1, & \text{if } 0 < v_k \le 500 \\
2, & \text{if } 500 < v_k \le 900 \\
3, & \text{if } 900 < v_k \le 1000
\end{cases}
\tag{25}
$$

$$
n_k =
\begin{cases}
1, & \text{if } 0 < v_k \le 500 \\
2, & \text{if } 500 < v_k \le 900 \\
3, & \text{if } 900 < v_k \le 1000.
\end{cases}
\tag{26}
$$

Remark 4: As a special case of the general S-WTA problem described in Section II, Case 1 caters to the simplified one that Bogdanowicz and Coleman [4] put forward. Case 2 represents the general S-WTA problems.
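The generator described by (22)–(26) can be sketched in a few lines of Python. The function names, the dictionary layout, and the `seed` argument are illustrative assumptions; the generation of $l_i$ is not specified in this section and is therefore left out.

```python
import random


def limit_from_threat(vk):
    """Eqs. (25) and (26): allowed number of sensors / weapons for threat value vk."""
    if vk <= 500:
        return 1
    if vk <= 900:
        return 2
    return 3


def generate_instance(S, W, T, case=2, p_lo=0.50, p_hi=0.95,
                      q_lo=0.72, q_hi=0.96, seed=None):
    """Random S-WTA instance; l_i is omitted because its rule is not given here."""
    rng = random.Random(seed)
    v = [rng.randint(1, 1000) for _ in range(T)]                 # threat values
    p = [[p_lo + (p_hi - p_lo) * rng.random() for _ in range(T)]
         for _ in range(S)]                                      # Eq. (22)
    q = [[q_lo + (q_hi - q_lo) * rng.random() for _ in range(T)]
         for _ in range(W)]                                      # Eq. (23)
    if case == 1:
        if not (S == W == T):
            raise ValueError("Case 1 requires S = W = T")
        m, n = [1] * T, [1] * T                                  # Eq. (24)
    else:
        m = [limit_from_threat(vk) for vk in v]                  # Eq. (25)
        n = [limit_from_threat(vk) for vk in v]                  # Eq. (26)
    return {"v": v, "p": p, "q": q, "m": m, "n": n}


general = generate_instance(6, 5, 4, case=2, seed=42)
simplified = generate_instance(3, 3, 3, case=1, seed=42)
```

Passing a seed makes an instance reproducible, which is convenient when several algorithms are compared on the same data.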

B. Algorithms for Comparison

1) Swt_opt and Its Extension: The Swt_opt algorithm proposed by Bogdanowicz et al. [4], [28], [37] represents the state of the art for the simplified S-WTA problems in which each sensor–weapon pair can be assigned to at most one target and each target can be engaged by at most one sensor–weapon pair. If the benefit matrix describing the benefit of assigning each sensor–weapon pair to each target is given, Swt_opt can find an assignment that maximizes the total benefit. As the optimality of the solution in the special case can be guaranteed, Swt_opt is also employed in our computational experiments for performance comparison. We take the expected value of the eliminated threat from the target involved in the SWT triplet (i, j, k) to approximate the benefit of the corresponding assignment, which is given as follows:

$$
b^{k}_{i,j} = v_k \, p_{ik} \, q_{jk}, \quad
\forall i = 1, 2, \ldots, S; \ \forall j = 1, 2, \ldots, W; \ \forall k = 1, 2, \ldots, T. \tag{27}
$$

Furthermore, we also adopt the extension of Swt_opt proposed by Bogdanowicz and Coleman [37] in their work to solve the general S-WTA problems in which the numbers of targets, weapons, and sensors may differ.
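The benefit values of (27) can be tabulated with a short Python sketch, reading the sensor term as the detection probability of sensor i for target k. The function name and the toy data are illustrative assumptions; the auction procedure of Swt_opt itself is not reproduced here.

```python
def benefit_tensor(v, p, q):
    """b[k][i][j] = v[k] * p[i][k] * q[j][k], i.e., one S x W benefit matrix per target."""
    S, W, T = len(p), len(q), len(v)
    return [[[v[k] * p[i][k] * q[j][k] for j in range(W)]
             for i in range(S)]
            for k in range(T)]


# Hypothetical toy data: 2 sensors, 2 weapons, 2 targets.
v = [100.0, 60.0]
p = [[0.9, 0.6], [0.7, 0.8]]
q = [[0.8, 0.9], [0.95, 0.75]]
b = benefit_tensor(v, p, q)
```

Each entry is the expected eliminated threat if the corresponding sensor–weapon pair were the only resources committed to that target.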

2) Random Sampling Method: For comparison, we adopt a random sampling (RS) method to solve the S-WTA problem. We number all the SWT triplets 1, 2, . . . , L (L = S × W × T). Each sampling here is achieved by a pure random permutation of the integers from 1 to L. Note that a permutation here stands for the sequence of all triplets in the construction process of a complete solution to the S-WTA problem. Akin to MRBCH, for each permutation, initially an empty auxiliary decision matrix will be generated, and then all triplets in the order of the permutation will be checked one by one by using the constraint handling approach presented in Section III-B. If no constraint is violated, the corresponding triplet will be added into the matrix. In this way, a complete solution can be constructed incrementally with regard to any permutation. In this method, many random permutations will be generated independently and decoded into S-WTA solutions within the running time permitted for each instance, and the best one among these solutions is selected as the final decision scheme.

Remark 5: In fact, the constraint handling procedure plays the role of a decoder. It has the virtue of guaranteeing the feasibility of the solutions. Obviously, RS is a statistical method without the use of information accumulated during its sampling process. In the family of sampling-based methods, including various iterative search methods, RS is the easiest one, but it is insensitive to local optima due to the independence among samplings.
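The permutation-plus-decoder idea can be illustrated with the following Python sketch. As constraints (6)–(9) are not restated in this section, the decoder assumes that each sensor and each weapon serves at most one target and that target k accepts at most `m[k]` sensors and `n[k]` weapons. A fixed number of samples replaces the time budget used in the experiments, and all names and numbers are hypothetical.

```python
import random
from itertools import product


def decode(order, v, p, q, m, n):
    """Scan triplets in the given order, keep the feasible ones, return (chosen, objective)."""
    S, W, T = len(p), len(q), len(v)
    sensor_of, weapon_of = [None] * S, [None] * W   # ASSUMED one target per resource
    sensors_on = [set() for _ in range(T)]
    weapons_on = [set() for _ in range(T)]
    chosen = []
    for i, j, k in order:
        if sensor_of[i] not in (None, k) or weapon_of[j] not in (None, k):
            continue
        if i not in sensors_on[k] and len(sensors_on[k]) >= m[k]:
            continue
        if j not in weapons_on[k] and len(weapons_on[k]) >= n[k]:
            continue
        sensor_of[i], weapon_of[j] = k, k
        sensors_on[k].add(i)
        weapons_on[k].add(j)
        chosen.append((i, j, k))
    objective = 0.0
    for k in range(T):
        p_mis = 1.0
        for i in sensors_on[k]:
            p_mis *= 1 - p[i][k]
        q_mis = 1.0
        for j in weapons_on[k]:
            q_mis *= 1 - q[j][k]
        objective += v[k] * (1 - p_mis) * (1 - q_mis)
    return chosen, objective


def random_sampling(v, p, q, m, n, samples=1000, seed=0):
    """Decode independent random permutations and keep the best scheme found."""
    rng = random.Random(seed)
    triplets = list(product(range(len(p)), range(len(q)), range(len(v))))
    best_chosen, best_value = [], float("-inf")
    for _ in range(samples):
        rng.shuffle(triplets)
        chosen, value = decode(triplets, v, p, q, m, n)
        if value > best_value:
            best_chosen, best_value = chosen, value
    return best_chosen, best_value


v = [100.0, 60.0]
p = [[0.9, 0.6], [0.7, 0.8]]
q = [[0.8, 0.9], [0.95, 0.75]]
best_chosen, best_value = random_sampling(v, p, q, m=[1, 1], n=[1, 1], samples=200)
```

Because every permutation is decoded from scratch, nothing learned from one sample influences the next, which is the independence property noted in Remark 5.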

C. Experiment Setup and Parameter Setting

In the computational experiments, 25 random instances are generated for performance validation and comparison, including two instances from Case 1 and 23 instances from Case 2 (see Section IV-A). The settings of the three basic parameters S, W, and T for these instances are shown as follows:

S5W4T3 (No. 1), S4W5T7 (No. 2)

S6W6T6 (No. 3)†, S9W8T8 (No. 4)

S10W7T12 (No. 5), S13W11T9 (No. 6)

S17W14T12 (No. 7), S19W14T15 (No. 8)

S15W18T21 (No. 9), S19W23T16 (No. 10)

S20W22T15 (No. 11), S21W18T20 (No. 12)

S20W28T23 (No. 13), S26W23T17 (No. 14)

S24W22T21 (No. 15), S28W23T19 (No. 16)

S23W19T30 (No. 17), S25W20T28 (No. 18)

S40W40T40 (No. 19)†, S62W50T30 (No. 20)

S67W72T54 (No. 21), S85W72T67 (No. 22)

S100W90T80 (No. 23), S150W130T120 (No. 24)

S180W170T160 (No. 25).

Note: The mark † indicates that the instance is generated from Case 1 (see Section IV-A).

Detailed data of these instances and the code of all the algorithms for comparison are available for download at http://pris.bit.edu.cn/home/people/OtherStaff/xinbin.htm.

Since the decision schemes produced by RS may be unstable due to its stochastic nature, RS is run 30 times for each instance and the results are statistically analyzed. Although MRBCH and Swt_opt are deterministic algorithms whose output remains constant for the same instance, both of them are also run 30 times to collect data about their runtime performance. As for Swt_opt, the parameter C, which is the smallest integer multiplier that enlarges the benefit values into integers, is set to 100, and the minimum bidding increment parameter ε is set to ε < 1/n, e.g., ε can be set to 0.0099 if n = 100 (see [28], [37] for more details about the two parameters). For fair comparison, RS terminates when it runs out of the permitted time, which is much longer than the runtime of MRBCH for the same instance. At least 1000 random samplings are executed for each instance to take full advantage of the time and find the best solution RS can reach. For each instance, however, the allowable runtime for any algorithm is set to 1800 s.

TABLE II. Comparative results with respect to computational time (s). “n.a.” indicates that the result is not available for the corresponding instance within the acceptable time (1800 s in this paper).

D. Results and Analysis

Experimental results are presented in Fig. 3 and Table II. Table II presents the statistical results about the running time of each algorithm regarding each instance. Fig. 3 provides the comparison results about the normalized objective values of the solutions obtained by the three algorithms, respectively. For each instance, we normalized the objective values from zero to one (i.e., 100%) based on the minimum and maximum objective values of the solutions obtained by all of the three algorithms

$$
f' = \frac{f - f_{\min}}{f_{\max} - f_{\min}} \tag{28}
$$

where $f'$, $f$, $f_{\max}$, and $f_{\min}$ are the normalized value, the original value, the obtained maximum value, and the obtained minimum value for a certain instance, respectively.
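A minimal Python sketch of the normalization in (28) is given below. The handling of the degenerate case in which all values coincide is an assumption added for robustness, since the text does not cover it, and the sample values are made up.

```python
def normalize(values):
    """Min-max normalization of objective values for one instance, Eq. (28)."""
    f_min, f_max = min(values), max(values)
    if f_max == f_min:
        # ASSUMPTION: if every algorithm reaches the same value, report 1.0 for all.
        return [1.0 for _ in values]
    return [(f - f_min) / (f_max - f_min) for f in values]


# Hypothetical objective values obtained by three algorithms on one instance.
normalized = normalize([50.0, 75.0, 100.0])
```

By construction, the best solution found on an instance maps to 1 and the worst to 0, so the normalized values are comparable across instances of different scales.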

1) Comparison in Terms of Solution Quality: As shown in Fig. 3, MRBCH performs remarkably better than RS and Swt_opt for the majority of the S-WTA test cases. To be specific, the mean objective value obtained by MRBCH is on average 37.10% better than that of Swt_opt and 21.57% better than that of RS. Furthermore, regarding the maximum objective value, the advantage of MRBCH over RS is 18.33%. It should be noted that better results are achieved with less computation cost (see Table II). It is apparent that the quality of solutions generated by RS is unstable and decreases with problem size. In particular, the performance of RS is the best for instance No. 1, since the size of the instance is very small and RS can cover all feasible solutions.

Fig. 3. Comparison results of MRBCH, Swt_opt, and RS with respect to the normalized objective value.

It is noteworthy that the performance of Swt_opt is the best among the three algorithms for the simplified S-WTA problem described in [4] (see the two special instances No. 3 and No. 19 in Fig. 3). However, MRBCH can provide high-quality solutions for such simplified problems with much less computation cost, and the objective value of these solutions is very close to that of Swt_opt. In the remaining cases, Swt_opt cannot guarantee the quality of solutions. This is mainly because the objective generally cannot be decomposed into the sum of the benefits of all triplets when more than one sensor (and/or weapon) can be assigned to one target.
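The non-decomposability of the objective can be seen in a small worked example, sketched here in Python with made-up numbers: one target with threat value 100 is engaged by two sensor–weapon pairs. The per-triplet benefit is taken as the product of the threat value, the detection probability, and the kill probability, and the combined value follows the form of $u_1(k)$.

```python
v = 100.0
p = [0.8, 0.8]   # hypothetical detection probabilities of the two sensors
q = [0.9, 0.9]   # hypothetical kill probabilities of the two weapons

# Summing per-triplet benefits treats the two pairs as if they were independent gains.
additive = sum(v * p[i] * q[i] for i in range(2))

# The combined expected eliminated threat uses the joint miss probabilities instead.
p_mis = (1 - p[0]) * (1 - p[1])
q_mis = (1 - q[0]) * (1 - q[1])
actual = v * (1 - p_mis) * (1 - q_mis)
```

Here the summed benefits amount to about 144, which even exceeds the threat value of the target, whereas the combined expected eliminated threat is about 95. A benefit matrix built from single triplets therefore overstates the value of assigning several resources to one target.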

2) Comparison in Terms of Runtime Performance: As reflected in Table II, MRBCH can solve S-WTA problems much more efficiently, especially in larger-sized cases, which validates the rationality of the marginal-return-based rules elucidated in Section III-C. Specifically, the computation time of MRBCH averaged over all cases is about 185 times shorter than that of RS and 650 times shorter than that of Swt_opt. In addition, MRBCH can solve small-sized and medium-sized S-WTA instances (No. 1–20), where L reaches nearly 100 000, almost instantly. Even for larger-scale S-WTA instances, e.g., instances No. 21–23, it can achieve efficient S-WTA within only a few seconds. Note that all the tests were performed on a PC without parallel speedup. The use of parallel computing as described in Section III-F will result in greater time efficiency of the proposed heuristic.

Fig. 4. Runtime performance of different algorithms.

In contrast, Swt_opt cannot generate solutions within the permitted computing time. In fact, when more than one sensor (and/or weapon) can be assigned to one target, the extension rules proposed in [37] will generate more elements of the same value (or zero elements) in the benefit matrix, which results in an extraordinarily large number of auctions (iterations) in Swt_opt. The same situation will also appear when the numbers of the sensors, the weapons, and the targets are severely mismatched.

In terms of scalability, MRBCH can solve larger-sized S-WTA problems with the characteristic parameter L greater than 4 800 000 within tens of seconds (see the result about instance No. 25 in Table II). Besides, as shown in Fig. 4, the time complexity of MRBCH is significantly lower than that of Swt_opt and RS.


As can be observed from Fig. 3, the solutions generated by MRBCH regarding instances No. 9, 12, 13, 15, 17, and 18 have obviously higher quality in contrast to Swt_opt and RS. This is because these instances correspond to severe defense scenarios where sensor and weapon resources are insufficient. In such situations, the decisions generated under high defense pressure destroy as many incoming targets as possible. Note that, since the sizes of the instances No. 1 and No. 2 are very small, RS may generate a better solution than MRBCH with a finite number of samplings. Even so, the average performance of RS in solving S-WTA problems is obviously inferior to that of MRBCH.

V. CONCLUSION

The coallocation of sensors and weapons to targets in NCW is a promising way to enhance the operational effectiveness. The resulting S-WTA problem is formulated as a constrained 0-1 programming problem. The problem-specific knowledge in the form of marginal return-based simple rules is utilized to build an efficient constructive heuristic. The proposed heuristic, namely MRBCH, can solve S-WTA problems very efficiently in terms of solution quality and runtime performance. MRBCH performs remarkably better than RS and Swt_opt for the majority of the S-WTA test cases. To be specific, the mean objective value obtained by MRBCH is on average 37.10% better than that of Swt_opt and 21.57% better than that of RS. In particular, RS works best under the scenario where the size of the instance is very small (e.g., instance No. 1 involving five sensors, four weapons, and three targets). Swt_opt is the best among the three algorithms for the simplified S-WTA problem described in [4]. The virtue of MRBCH is also reflected by its lower computational complexity, as larger S-WTA instances (e.g., those involving more than 100 sensors, 100 weapons, and 100 targets) can be solved in a short time. As for small and medium-sized instances, MRBCH can generate high-quality solutions almost instantly (e.g., those instances involving fewer than 85 sensors, 72 weapons, and 67 targets can be solved in less than 2 s). In addition, it is expected that MRBCH can be sped up with the support of parallel hardware.

Although the sampling-based RS on the whole does not exhibit excellent performance, it is possible to design efficient sampling-based optimizers like memetic algorithms to explore the solution space of S-WTA problems more efficiently [38]–[40]. MRBCH can also be used to provide initial solutions to these optimizers to accelerate their convergence toward global optima or near-optimal solutions [41]. However, it should be noted that the applicability of these advanced optimizers is obviously limited by their real-time performance. In the pursuit of their merits in improving solution quality, powerful computing platforms, especially those which facilitate parallel computing, are usually indispensable for supporting sufficient iterations and samplings in the solution space within the strict time constraints of practical S-WTA decision-making. Furthermore, how to mine potential knowledge and incorporate the knowledge into the design of problem-solvers also deserves further research in the future.

REFERENCES

[1] D. S. Alberts, J. J. Garstka, and F. P. Stein, Network Centric Warfare: Developing and Leveraging Information Superiority, document ADA406255, Defense Tech. Inf. Center, Fort Belvoir, VA, USA, 2000.

[2] S. Paradis, A. Benaskeur, M. Oxenham, and P. Cutler, “Threat evaluation and weapons allocation in network-centric warfare,” in Proc. 8th Int. Conf. Inf. Fusion, vol. 2. Philadelphia, PA, USA, Jun. 2005, pp. 1078–1085.

[3] F. E. George, M. Jain, K. Jakubowski, and K. Whittaker, “Conflict resolution for shared resources between network managers,” in Proc. Mil. Commun. Conf., San Jose, CA, USA, Oct./Nov. 2010, pp. 1323–1328.

[4] Z. R. Bogdanowicz and N. P. Coleman, “Sensor-target and weapon-target pairings based on auction algorithm,” in Proc. 11th WSEAS Int. Conf. Appl. Math., Dallas, TX, USA, Mar. 2007, pp. 92–96.

[5] F. Bolderheij and P. V. Genderen, “Mission driven sensor management,” in Proc. 7th Int. Conf. Inf. Fusion, Stockholm, Sweden, Jun. 2004, pp. 799–804.

[6] R. Tharmarasa, T. Kirubarajan, J. M. Peng, and T. Lang, “Optimization-based dynamic sensor management for distributed multitarget tracking,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 39, no. 5, pp. 534–546, Sep. 2009.

[7] G. Asnis and S. Blackman, “Optimal allocation of multi-platform sensor resources for multiple target tracking,” in Proc. 14th Int. Conf. Inf. Fusion, Chicago, IL, USA, Jul. 2011, pp. 1–8.

[8] Y. C. Wang, G. L. Shan, and J. Tong, “Solving sensor-target assignment problem based on cooperative memetic PSO algorithm,” Syst. Eng. Electron., vol. 35, no. 5, pp. 1000–1007, 2013.

[9] S. K. Das, “Modeling intelligent decision-making command and control agents: An application to air defense,” IEEE Intell. Syst., vol. 29, no. 5, pp. 22–29, Sep./Oct. 2014.

[10] C.-P. Chen et al., “A hybrid memetic framework for coverage optimization in wireless sensor networks,” IEEE Trans. Cybern., vol. 45, no. 10, pp. 2309–2322, Oct. 2015.

[11] R. K. Ahuja, A. Kumar, K. C. Jha, and J. B. Orlin, “Exact and heuristic algorithms for the weapon-target assignment problem,” Oper. Res., vol. 55, no. 6, pp. 1136–1146, Nov./Dec. 2007.

[12] A. M. Madni and M. Andrecut, “Efficient heuristic approach to the weapon-target assignment problem,” J. Aerosp. Comput. Inf. Commun., vol. 6, no. 6, pp. 405–414, 2009.

[13] F. Johansson and G. Falkman, “Real-time allocation of firing units to hostile targets,” J. Adv. Inf. Fusion, vol. 6, no. 2, pp. 187–199, 2011.

[14] B. Xin, J. Chen, Z. H. Peng, L. H. Dou, and J. Zhang, “An efficient rule-based constructive heuristic to solve dynamic weapon-target assignment problem,” IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 41, no. 3, pp. 598–606, May 2011.

[15] B. Xin, J. Chen, J. Zhang, L. H. Dou, and Z. H. Peng, “Efficient decision makings for dynamic weapon-target assignment by virtual permutation and tabu search heuristics,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 40, no. 6, pp. 649–662, Nov. 2010.

[16] B. Xin and J. Chen, “An estimation of distribution algorithm with efficient constructive repair/improvement operator for the dynamic weapon-target assignment,” in Proc. 31st Chin. Control Conf., Hefei, China, Jul. 2012, pp. 2346–2351.

[17] Z. R. Bogdanowicz, “Advanced input generating algorithm for effect-based weapon–target pairing optimization,” IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 42, no. 1, pp. 276–280, Jan. 2012.

[18] Z. R. Bogdanowicz, A. Tolano, K. Patel, and N. P. Coleman, “Optimization of weapon–target pairings based on kill probabilities,” IEEE Trans. Cybern., vol. 43, no. 6, pp. 1835–1844, Dec. 2013.

[19] Z. R. Bogdanowicz and K. Patel, “Quick collateral damage estimation based on weapons assigned to targets,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 45, no. 5, pp. 762–769, May 2015.

[20] L. Juan, C. Jie, and X. Bin, “Efficiently solving multi-objective dynamic weapon-target assignment problems by NSGA-II,” in Proc. 34th Chin. Control Conf., Hangzhou, China, Jul. 2015, pp. 2556–2561.

[21] M. E. Havens, “Dynamic allocation of fires and sensors,” Ph.D. dissertation, Naval Postgraduate School, Monterey, CA, USA, 2002.

[22] S. A. Harman et al., “Sensor network performance modeling for weapon locating,” in Proc. Defense Security, 2004, pp. 446–457.

[23] H. Li, H.-Y. Wang, S.-Y. Sun, and Z.-M. Qiu, “Optimization matching of sensor and weapon system based on geometric programming,” in Proc. 3rd Int. Conf. Innov. Comput. Inf. Control, Dalian, China, Jun. 2008, p. 97.


[24] Z. F. Li, X. M. Li, J. Yan, J. J. Dai, and F. Kong, “An anytime algorithm based on decentralized cooperative auction for dynamic joint fire distribution problem,” in Proc. IEEE Int. Conf. Mechatronics Autom., Chengdu, China, Aug. 2012, pp. 2031–2036.

[25] K. L. Ezra, D. A. DeLaurentis, and L. Mockus, “Comparative solution methods for the integrated problem of sensors, weapons, and targets,” in Proc. AIAA Model. Simulat. Technol. Conf., Atlanta, GA, USA, Jun. 2014, pp. 2082–2094.

[26] W. Jian and C. Chen, “Sensor-weapon joint management based on improved genetic algorithm,” in Proc. 34th Chin. Control Conf., Hangzhou, China, Jul. 2015, pp. 2738–2742.

[27] Z. F. Li, X. M. Li, J. J. Dai, J. Z. Chen, and F. X. Zhang, “Sensor-weapon target assignment based on improved SWT-opt algorithm,” in Proc. 2nd IEEE Int. Conf. Comput. Control Ind. Eng., vol. 2. Wuhan, China, Aug. 2011, pp. 25–28.

[28] Z. R. Bogdanowicz, “A new efficient algorithm for optimal assignment of smart weapons to targets,” Comput. Math. Appl., vol. 58, no. 10, pp. 1965–1969, Nov. 2009.

[29] H. D. Chen, Z. Liu, Y. L. Sun, and Y. F. Li, “Particle swarm optimization based on genetic operators for sensor-weapon-target assignment,” in Proc. 5th Int. Symp. Comput. Intell. Design, vol. 2. Hangzhou, China, Oct. 2012, pp. 170–173.

[30] S. Yu, T. Chai, and Y. Tang, “An effective heuristic rescheduling method for steelmaking and continuous casting production process with multirefining modes,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 46, no. 12, pp. 1675–1688, Dec. 2016.

[31] X. Li, B. Shou, and D. Ralescu, “Train rescheduling with stochastic recovery time: A new track-backup approach,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 44, no. 9, pp. 1216–1233, Sep. 2014.

[32] W. Q. Zhao, Q. G. Meng, and P. W. H. Chung, “A heuristic distributed task allocation method for multivehicle multitask problems and its application to search and rescue scenario,” IEEE Trans. Cybern., vol. 46, no. 4, pp. 902–915, Apr. 2016.

[33] R. A. Murphey, “Target-based weapon target assignment problems,” in Nonlinear Assignment Problems: Algorithms and Applications, P. M. Pardalos and L. S. Pitsoulis, Eds. Boston, MA, USA: Springer, 2000, pp. 39–53.

[34] S. E. Kolitz, “Analysis of a maximum marginal return assignment algorithm,” in Proc. 27th IEEE Conf. Decis. Control, vol. 3. Austin, TX, USA, Dec. 1988, pp. 2431–2436.

[35] V. An, Z. Qu, and R. Roberts, “A rainbow coverage path planning for a patrolling mobile robot with circular sensing range,” IEEE Trans. Syst., Man, Cybern., Syst., to be published, doi: 10.1109/TSMC.2017.2662623.

[36] S. P. Lloyd and H. S. Witsenhausen, “Weapons allocation is np-complete,” in Proc. Summer Conf. Simulate, Reno, NV, USA, 1986, pp. 1054–1058.

[37] Z. R. Bogdanowicz and N. P. Coleman, Advanced Algorithm for Optimal Sensor-Target and Weapon-Target Pairings in Dynamic Collaborative Engagement, document ADA505790, Defense Tech. Inf. Center, Fort Belvoir, VA, USA, 2008.

[38] S.-Y. Wang and L. Wang, “An estimation of distribution algorithm-based memetic algorithm for the distributed assembly permutation flow-shop scheduling problem,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 46, no. 1, pp. 139–149, Jan. 2016.

[39] G. Zhang and Y. Li, “A memetic algorithm for global optimization of multimodal nonseparable problems,” IEEE Trans. Cybern., vol. 46, no. 6, pp. 1375–1387, Jun. 2016.

[40] J. Chen, B. Xin, Z. Peng, L. Dou, and J. Zhang, “Optimal contraction theorem for exploration–exploitation tradeoff in search and optimization,” IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 39, no. 3, pp. 680–691, May 2009.

[41] A. S. Azad, M. M. Islam, and S. Chakraborty, “A heuristic initialized stochastic memetic algorithm for MDPVRP with interdependent depot operations,” IEEE Trans. Cybern., vol. 47, no. 12, pp. 4302–4315, Dec. 2017.

[42] G. G. denBroeder, Jr., R. E. Ellison, and L. Emerling, “On optimum target assignments,” Oper. Res., vol. 7, no. 3, pp. 322–326, Jun. 1959.

Bin Xin (S’09–M’10) received the B.S. degree in information engineering and the Ph.D. degree in control science and engineering from the Beijing Institute of Technology, Beijing, China, in 2004 and 2012, respectively.

He was an Academic Visitor with the Decision and Cognitive Sciences Research Centre, University of Manchester, Manchester, U.K., from 2011 to 2012. He is currently an Associate Professor with the School of Automation, Beijing Institute of Technology. His current research interests include search and optimization, evolutionary computation, combinatorial optimization, and multiagent systems.

Dr. Xin serves as an Associate Editor for the Journal of Advanced Computational Intelligence and Intelligent Informatics.

Yipeng Wang received the B.S. degree in automation from the Beijing Institute of Technology, Beijing, China, in 2016, where he is currently pursuing the master’s degree with the School of Automation.

His current research interests include swarm intelligence and resource allocation.

Jie Chen (M’09–SM’12) received the B.S., M.S., and Ph.D. degrees in control theory and control engineering from the Beijing Institute of Technology, Beijing, China, in 1986, 1996, and 2001, respectively.

From 1989 to 1990, he was a Visiting Scholar with the California State University, Long Beach, CA, USA. From 1996 to 1997, he was a Research Fellow with the School of E&E, University of Birmingham, Birmingham, U.K. He is currently a Professor of control science and engineering with the Beijing Institute of Technology, where he is also the Head of the State Key Laboratory of Intelligent Control and Decision of Complex Systems. He has coauthored three books and over 200 research papers. His current research interests include intelligent control and decision in complex systems, multiagent systems, and optimization methods.

Dr. Chen served as a Managing Editor for the Journal of Systems Science and Complexity from 2014 to 2017 and has been serving as an Associate Editor for the IEEE TRANSACTIONS ON CYBERNETICS since 2016, the International Journal of Robust and Nonlinear Control since 2017, and many other international journals.