n. basilico, n. gatti, f. amigoni dei, politecnico di milano leader-follower strategies for robotic...

16
N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology Nicola Basilico, Nicola Gatti, Francesco Amigoni {basilico,ngatti,amigoni}@elet.polimi.it

Upload: taylor-mackay

Post on 26-Mar-2015

213 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary

Topology

Nicola Basilico, Nicola Gatti, Francesco Amigoni{basilico,ngatti,amigoni}@elet.polimi.it

Page 2: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Mobile Robot Patrolling

• A robotic patrolling situation is characterized by one or more mobile robotic patrollers and by some targets to be patrolled (Agmon et al., 2008), (Amigoni et al., 2008), (Gatti, 2008), (Paruchuri et al., 2008)

• Usually, due to the characteristics of the setting, the patrollers cannot employ a deterministic strategy, otherwise the intruder will surely succeed in attacking a target

• Patrollers should adopt an unpredictable patrolling strategy, randomizing over the targets and trying to reduce the intrusion probability (Pita et al., 2008)

• The scientific challenge is the development of effective patrolling strategies, i.e., the optimal probabilities with which the patroller moves in the environment

Page 3: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Example: Deterministic Strategy

€€

Page 4: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Example: Non-Deterministic Strategy

€€

Page 5: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Patrolling with Adversaries

• Considering adversaries can provide the patrolling robots a larger expected utility

• (Agmon et al., 2008) considers non-explicit adversary models in perimeter settings: the optimal patroller’s strategy is its max-min strategy

• (Paruchuri et al., 2008) considers explicit adversary models in settings without topologies: the optimal patroller’s strategy is its leader-follower strategy (von Stengel and Zamir, 2004)

• Knowing the adversary preferences (also with uncertainty) provides the patrollers a larger expected utility than not knowing them

Page 6: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Assumptions

• Time is discretized in turns

• There is a single patrolling robot equipped with a sensor (e.g., a camera) able to detect intruders

• The environment is discretized in cells and its topology is represented by a directed graph

• The intruder cannot do anything else for some turns once it has attempted to enter a cell (this amounts to say that penetration takes some time)

Page 7: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

The Environment Model

• C is the set of n cells to be patrolled

• Matrix T(nxn) represents the topology, where ti,j = 1 means that cells i and j are adjacent (the patroller can go from i to j with one action) and ti,j = 0 means that they are not

• Matrix V(nxn) represents the patroller’s sensing capabilities, where vi,j = 1 if cell j can be sensed by the patroller from cell i and vi,j = 0 otherwise

• Each cell i requires the intruder di > 0 turns to enter

• Xi and Yi (with i {1, 2, …, n}∈ ) are the values to the patroller and to the intruder, respectively, when the intruder successfully strikes cell i

• X0 and Y0 are the values to the patroller and to the intruder, respectively, when the intruder is captured

Page 8: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Example: Environment

d = 6X=.8,Y=.4

d = 4X=.7,Y=.5

d = 5X=.8,Y=.4

X0=1Y0=-1

The shortest cycle is 12 cells long and then no deterministic strategy is effective

Page 9: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

The Game Model

• At each turn, a strategic-form game is repeated in which the players act simultaneously

• The patroller chooses the next cell to move to among those directly connected to its current cell: called i the current cell of the robot, its actions are move(j), such that ti,j = 1

• If the intruder has not previously attempted to enter any cell, chooses whether or not to enter a cell and, in the former case, what cell to enter: its actions are wait and enter(i)

• If, instead, the intruder has previously attempted to enter a cell i, it cannot take any action for di turns

Page 10: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Example

Page 11: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Markov Hypothesis

• Being the game an extensive-form game with infinite-horizon, its study is commonly conducted by introducing symmetries: the patroller’s strategy depends only on the last |H| actions (i.e., the last |H| visited cells)

• The patroller’s strategy is denoted by αH,j, that is the probability to make action move(j) after history H

• When |H| = 0, the patroller’s strategy does not depend on any last visited cell, i.e., αj

• When |H| = 1, the patroller’s strategy depends only on the last visited cell, i.e., αi,j

• The intruder’s action space can be reduced to:

• stay-out: the intruder never enters the environment

• enter-when(i,j): the intruder enters cell i when the patroller is in cell j

Page 12: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Leader-Follower Equilibrium

• In a leader-follower scenario, the follower chooses its strategy after having observed the leader’s strategy

• The patrolling scenario is a leader-follower scenario where the patroller is the leader and the intruder is follower

• A leader-follower equilibrium is the appropriate solution concept for a game with a leader-follower position: it assures the leader a non smaller expected utility than any Nash equilibrium (von Stengel and Zamir, 2004)

• The follower’s strategy is pure (von Stengel and Zamir, 2004)

Page 13: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Solving Algorithm

• Computing a leader-follower equilibrium is essentially a bilevel optimization problem (Conitzer and Sandholm, 2006)

• In our case the second optimization level is bilinear• Our algorithm develops in two steps• Step 1: stay-out as intruder’s best response

• Checking whether or not there exists a patroller’s strategy such that stay-out is a intruder’s best response is a bilinear feasibility problem

• Step 2: enter-when as intruder’s best response• For each possible enter-when(i,j) we compute the optimal

patroller strategy as a bilinear optimization problem• The leader-follower strategy is the strategy that assures the

largest expected utility to the patroller among all the ones computed above

Page 14: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Example: Solution

.65

.95

.53.35

.55

.35

.33

.45

.41

.05

.23

.59

1.00

.44

.48

.47.52

.351.00

.30

Page 15: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Action Space Reduction

• We developed two reduction algorithms

• We reduce the patroller’s action space

• We reduce the intruder’s action space

• Our algorithms allow one to save about 96%-97% computational time

• Example:

• Without reduction: 132 bilinear optimization problems

• With reduction: 10 bilinear optimization problems

€€

Page 16: N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology

N. Basilico, N. Gatti, F. AmigoniDEI, Politecnico di Milano

Conclusions and Future Works

• Achievements

• A satisfactory patrolling model that generalizes models from the state of the art

• An exact algorithm for leader-follower equilibrium computation

• Effective algorithms for removing dominated actions and improving computational efficiency

• Future Works

• Introducing uncertainty over intruder’s preferences and patroller’s sensing capabilities

• Developing an algorithm for computing the deterministic patrolling strategy in general settings

• Extending our framework to multiple patrollers