lifelong multi-agent path finding for online pickup and ... · lifelong multi-agent path finding...

58
Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig University of Southern California Tsinghua University May 11, 2017 AAMAS

Upload: others

Post on 26-Oct-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Lifelong Multi-Agent Path Findingfor Online Pickup and Delivery Tasks

Hang Ma Jiaoyang Li T. K. Satish Kumar Sven KoenigUniversity of Southern California

Tsinghua University

May 11, 2017AAMAS

Page 2: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Multi-Agent Path Finding

Find collision-free paths for all agents from their currentlocations to their predefined goal locations in a knownenvironment.

a1

a2

g1

g2

Page 3: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Time Step 0

a1

a2

g1

g2

Page 4: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Time Step 1

a1

a2

g1

g2

Page 5: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Time Step 2

a1a2

g2

Page 6: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Time Step 3

a1

a2

Page 7: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Motivated by Real-World Applications:

Automated aircraft-towing vehicles, warehouse robots, officerobots, and game characters in video games.

Page 8: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Amazon Warehouse Robots1

Tasks: Move inventory shelves from storage locations toinventory stations or vice versa.

Figure 3: A small region of a Kiva layout. The green cells represent pod storage locations, the orange ovals the robots (withpods not pictured), and the purple and pink regions the queues around the inventory stations.

Figure 2: A Kiva drive unit and storage pod.

used to move the inventory pods with the correct bins fromtheir storage locations to the inventory stations where a pickworker removes the desired products from the desired bin.Note that the pod has four faces, and the drive unit may needto rotate the pod in order to present the correct face. When apicker is done with a pod, the drive unit stores it in an emptystorage location.

Each station is equipped with a desktop computer thatcontrols pick lights, barcode scanners, and laser pointers thatare used to identify the pick and put locations. Because ev-ery product is scanned in and out of the system, overall pick-ing errors go down, which potentially eliminates the needfor post-picking quality control. In general, every station iscapable of being either a picking station or a replenishmentstation. In practice, pick stations will be located near out-bound conveyors, and replenishment stations will be locatednear pallet drop off points.

The power of the Kiva solution comes from the fact thatit allows every worker to have random access to any inven-tory in the warehouse. Moreover, inventory can be retrievedin parallel. When the picker is filling several boxes at thesame time, the parallel, random access ensures that she isnot waiting on pods to arrive. In fact, by keeping a smallqueue of work at the station, the Kiva system delivers a newpod face every six seconds, which sets a baseline pickingrate of 600 lines per hour.2 Peak rates can exceed 600 linesper hour when the operator can pick more than one item offa pod.3

For a large warehouse, the savings in personnel can besignificant. Consider, for example, what a Kiva implemen-tation of the book warehouse would involve. A busy book-seller may ship 100,000 boxes a day. With existing automa-tion, this level of output would employ perhaps 75 workers

2This statistic is based on single unit picks and has been repro-duced for extended periods in the Kiva test facility.

3This statistic was verified when a small Kiva demonstrationsystem was brought to a drugstore distribution center where opera-tors picked at nearly 700 lines per hour.

1755

Figure 5: The Kiva demonstration facility.

AcknowledgmentsBuilding a working MVS requires a core set of great me-chanical, electrical, and software engineers. It is yet an-other thing to turn it into a commercial product and managethe manufacture, assembly, and deployment of these sys-tems. We thank the world-class Kiva employees who havebreathed life into this vision.

ReferencesBoutilier, C.; Shoham, Y.; and Wellman, M. P. 1997. Eco-nomic principles of multi-agent systems. Artificial Intelli-gence 94(1):1–6.Butenko, S.; Murphey, R.; and Pardos, P. M., eds. 2003.Cooperative Control: Models, Applications and Algo-rithms. Springer.Gilmour, K. 2003. Amazon warehouse, amazon adventure.Internet Magazine.Hazard, C. J.; Wurman, P. R.; and D’Andrea, R. 2006.Alphabet soup: A testbed for studying resource allocationin multi-vehicle systems. In Proceedings of the 2006 AAAIWorkshop on Auction Mechanisms for Robot Coordination,23–30.Jennings, N. R., and Bussmann, S. 2003. Agent-based con-trol systems: Why are they suited to engineering complexsystems? IEEE Control Systems Magazine 61–73.Jennings, N. R. 1996. Coordination techniques for dis-tributed artificial intelligence. In O’Hare, G. M. P., andJennings, N. R., eds., Foundations of Distributed ArtificialIntelligence. Wiley. 187–210.

Konolige, K.; Fox, D.; Ortiz, C.; Agno, A.; Eriksen, M.;Limketkai, B.; Ko, J.; Morisset, B.; Schulz, D.; Stewart,B.; and Vincent, R. 2004. Centibots: Very large scale dis-tributed robotic teams. In Proceedings of the InternationalSymposium on Experimental Robotics.Lesser, V. R. 1999. Cooperative multiagent systems: Apersonal view of the state of the art. IEEE Transactions onKnowledge and Data Engineering 11(1):133–142.Malone, T. W.; Fikes, R. E.; Grant, K. R.; and Howard,M. T. 1988. Enterprise: A market-like task scheduler fordistributed computing environments. In Huberman, B. A.,ed., The Ecology of Computation. North Holland.Rosenschein, J. S., and Zlotkin, G. 1994. Rules of En-counter. Cambridge: The MIT Press.Simmons, R.; Smith, T.; Dias, M. B.; Goldberg, D.; Hersh-berger, D.; Stentz, A.; and Zlot, R. 2002. A layered archi-tecture for coordination of mobile robots. In Schultz, A.,and Parker, L., eds., Multi-Robot Systems: From Swarmsto Intelligent Automata. Kluwer.Wellman, M. P., and Wurman, P. R. 1998. Market-awareagents for a multiagent world. Robotics and AutonomousSystems 24:115–25.

1759

1P. R. Wurman, R. D’Andrea, and M. Mountz. “Coordinating Hundreds of Cooperative, Autonomous Vehicles inWarehouses”. In: AI Magazine 29.1 (2008), pp. 9–20.

Page 9: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Multi-Agent Pickup and Delivery (MAPD) Problem

I Existing research on multi-agent path finding — a“one-shot” version:One pre-determined task for each agent — navigates to itsgoal location.

I MAPD — a “lifelong” version of multi-agent path finding:I A task can enter the system at any time.I Agents have to constantly attend to a stream of new tasks.

Page 10: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

MAPD Algorithms

1. Decoupled Task Assignment and Path FindingI Token Passing (TP): Greedy task assignment and no task

reassignment.I Token Passing with Task Swaps (TPTS): Local task

reassignment between two agents.

2. Centralized Task Assignment and Path FindingCENTRAL

Roughly:I Effectiveness: TP < TPTS < CENTRALI Efficiency: CENTRAL < TPTS < TP

Page 11: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Tasks

Figure 3: A small region of a Kiva layout. The green cells represent pod storage locations, the orange ovals the robots (withpods not pictured), and the purple and pink regions the queues around the inventory stations.

Figure 2: A Kiva drive unit and storage pod.

used to move the inventory pods with the correct bins fromtheir storage locations to the inventory stations where a pickworker removes the desired products from the desired bin.Note that the pod has four faces, and the drive unit may needto rotate the pod in order to present the correct face. When apicker is done with a pod, the drive unit stores it in an emptystorage location.

Each station is equipped with a desktop computer thatcontrols pick lights, barcode scanners, and laser pointers thatare used to identify the pick and put locations. Because ev-ery product is scanned in and out of the system, overall pick-ing errors go down, which potentially eliminates the needfor post-picking quality control. In general, every station iscapable of being either a picking station or a replenishmentstation. In practice, pick stations will be located near out-bound conveyors, and replenishment stations will be locatednear pallet drop off points.

The power of the Kiva solution comes from the fact thatit allows every worker to have random access to any inven-tory in the warehouse. Moreover, inventory can be retrievedin parallel. When the picker is filling several boxes at thesame time, the parallel, random access ensures that she isnot waiting on pods to arrive. In fact, by keeping a smallqueue of work at the station, the Kiva system delivers a newpod face every six seconds, which sets a baseline pickingrate of 600 lines per hour.2 Peak rates can exceed 600 linesper hour when the operator can pick more than one item offa pod.3

For a large warehouse, the savings in personnel can besignificant. Consider, for example, what a Kiva implemen-tation of the book warehouse would involve. A busy book-seller may ship 100,000 boxes a day. With existing automa-tion, this level of output would employ perhaps 75 workers

2This statistic is based on single unit picks and has been repro-duced for extended periods in the Kiva test facility.

3This statistic was verified when a small Kiva demonstrationsystem was brought to a drugstore distribution center where opera-tors picked at nearly 700 lines per hour.

1755

Figure 5: The Kiva demonstration facility.

AcknowledgmentsBuilding a working MVS requires a core set of great me-chanical, electrical, and software engineers. It is yet an-other thing to turn it into a commercial product and managethe manufacture, assembly, and deployment of these sys-tems. We thank the world-class Kiva employees who havebreathed life into this vision.

ReferencesBoutilier, C.; Shoham, Y.; and Wellman, M. P. 1997. Eco-nomic principles of multi-agent systems. Artificial Intelli-gence 94(1):1–6.Butenko, S.; Murphey, R.; and Pardos, P. M., eds. 2003.Cooperative Control: Models, Applications and Algo-rithms. Springer.Gilmour, K. 2003. Amazon warehouse, amazon adventure.Internet Magazine.Hazard, C. J.; Wurman, P. R.; and D’Andrea, R. 2006.Alphabet soup: A testbed for studying resource allocationin multi-vehicle systems. In Proceedings of the 2006 AAAIWorkshop on Auction Mechanisms for Robot Coordination,23–30.Jennings, N. R., and Bussmann, S. 2003. Agent-based con-trol systems: Why are they suited to engineering complexsystems? IEEE Control Systems Magazine 61–73.Jennings, N. R. 1996. Coordination techniques for dis-tributed artificial intelligence. In O’Hare, G. M. P., andJennings, N. R., eds., Foundations of Distributed ArtificialIntelligence. Wiley. 187–210.

Konolige, K.; Fox, D.; Ortiz, C.; Agno, A.; Eriksen, M.;Limketkai, B.; Ko, J.; Morisset, B.; Schulz, D.; Stewart,B.; and Vincent, R. 2004. Centibots: Very large scale dis-tributed robotic teams. In Proceedings of the InternationalSymposium on Experimental Robotics.Lesser, V. R. 1999. Cooperative multiagent systems: Apersonal view of the state of the art. IEEE Transactions onKnowledge and Data Engineering 11(1):133–142.Malone, T. W.; Fikes, R. E.; Grant, K. R.; and Howard,M. T. 1988. Enterprise: A market-like task scheduler fordistributed computing environments. In Huberman, B. A.,ed., The Ecology of Computation. North Holland.Rosenschein, J. S., and Zlotkin, G. 1994. Rules of En-counter. Cambridge: The MIT Press.Simmons, R.; Smith, T.; Dias, M. B.; Goldberg, D.; Hersh-berger, D.; Stentz, A.; and Zlot, R. 2002. A layered archi-tecture for coordination of mobile robots. In Schultz, A.,and Parker, L., eds., Multi-Robot Systems: From Swarmsto Intelligent Automata. Kluwer.Wellman, M. P., and Wurman, P. R. 1998. Market-awareagents for a multiagent world. Robotics and AutonomousSystems 24:115–25.

1759

g1

s2

g2

s1

Page 12: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Executing Task

In order to execute a task, the agent has to move from itscurrent location via the pickup location to the delivery location:

1. When the agent reaches the pickup location, it starts toexecute the task.

2. When it reaches the delivery location, it finishes the task.

g1

s1

Page 13: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Free Agents

Free Agents:

g1

s1

a2

a1

Page 14: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Occupied Agents

Occupied Agents:

g1

s1a2

a1

Page 15: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Assignment of Agents to Tasks

I A free agent can be assigned to any unexecuted task.

I An occupied agent has to finish executing its current task.

Page 16: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Objective of MAPD

Finish executing each task as quickly as possible.

Page 17: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Effectiveness of a MAPD algorithm

Service time: the average number of timesteps needed tofinish executing each task after it enters the system.An algorithm solves a MAPD instance ⇐⇒ Service time of alltasks is bounded.

Page 18: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Service time: 7+72 = 7

g1

s2

g2

s1

a2

a1

Page 19: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Solvability

Not every MAPD instance is solvable.

s1 a1 a2 g1

Page 20: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Well-Formed MAPD Instances

Being well-formed (based on [M. Cap et al 2015]2): a sufficientcondition that makes MAPD instances solvable.Intuition: agents should only be allowed to rest (that is, stayforever) in locations, called parking locations, where theycannot block other agents.

2M. Cap, J. Vokrınek, and A. Kleiner. “Complete Decentralized Method for On-Line Multi-Robot TrajectoryPlanning in Well-formed Infrastructures”. In: International Conference on Automated Planning and Scheduling.2015, pp. 324–332.

Page 21: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Parking Locations

I Task Parking Locations: all pickup and delivery locationsof tasks(storage locations, inventory stations, etc.)

I Non-task Parking Locations:I All initial locations of agentsI Additional designated parking locations

g1

s2

g2

s1

a1 a2

a3

Page 22: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Well-Formed MAPD Instances

1. # tasks is finite;2. # non-task parking locations ≥ # agents;3. For any two parking locations, there exists a path between

them that traverses no other parking locations.

a1 a2 a1 a2 a1 a2

Page 23: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

MAPD Algorithms

We present

1. Two Decoupled Algorithms:complete for well-formed MAPD instances (solve allwell-formed instances)

I Token Passing (TP)I Token Passing with Task Swaps (TPTS)

2. One Centralized Algorithm:CENTRAL

Page 24: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

A Running Example

Unexecuted Tasks: task1, task2.Agent a1 and agent a2 are resting.Agent a3 is assigned to task1 and on the way to the pickuplocation s1.

a1

a2a3

s1

g1

s2

g2

Page 25: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Token Passing (TP)

Based on an idea similar to Cooperative A*3:

I Token: a synchronized shared block of memory thatcontains the current paths of all agents, set of unexecutedtask, and agent assignments.

I Only one agent has access to the token at each time.I Each agent assigns itself a task, plan its path, and passes

the token to the next agent.

3D. Silver. “Cooperative Pathfinding”. In: Artificial Intelligence and Interactive Digital Entertainment. 2005,pp. 117–122.

Page 26: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Key Idea

I A task can only be assigned once.I Once an agent is assigned to a task, it cannot be assigned

to other tasks until it finishes the task.

Page 27: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Running Example

Task Available for Assignment: task2.Agent a1 and agent a2 request for token.

a1

a2a3

s1

g1

s2

g2

Page 28: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Agent a1’s Turn

Agent a1 Has Token

1. it cannot assign itself to any task because agent a2 rests ing2, the only task available to it;

2. it has to rest in a parking location that will not create anydeadlock;

3. it can continue to rest in its current location.

s1

g1

s2

g2

Page 29: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Agent a2’s Turn

Agent a2 Has Token

1. it assigns itself to task2;2. task2 is no longer available to other agents;3. it plans a cost-minimal collision-free path to execute task2.

s1

g1

s2

g2

Page 30: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Animation

s1

g1

s2

g2

Page 31: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Animation

s1

g1

s2

g2

Page 32: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Animation

s1

g1

s2

g2

Page 33: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Animation

s1

g1

s2

g2

Page 34: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Animation

s1

g1

s2

g2

Page 35: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Animation

s1

g1

s2

g2

Page 36: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Animation

s2

g2

Page 37: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Animation

s2

g2

Page 38: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Animation

Page 39: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TP: Completeness

TheoremAll well-formed MAPD instances are solvable, and TP solvesthem.

Page 40: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Improving the Effectiveness of TP

TP is simple but can be made more effective:

I A task with an assigned agent can be assigned a newagent (as long as the task has not been executed).

Page 41: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Token Passing with Task Swaps (TPTS)

An agent is allowed to grab a task from another agent if it canfinish the task earlier.

Page 42: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TPTS: Running Example

Tasks Available for Assignment: task1, task2.Agent a1 and agent a2 request for token.

a1

a2a3

s1

g1

s2

g2

Page 43: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TPTS: Agent a1’s Turn

Agent a1 has token.Agent a1 grabs task1 from agent a3.

s1

g1

s2

g2

Page 44: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TPTS: Agent a3 Making Decisions

Agent a3 has token.Agent a3 moves to a parking location that will not create anydeadlock in the future.

s1

g1

s2

g2

Page 45: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TPTS: Agent a2’s Turn

Agent a2 has token.Agent a2 grabs task1 from agent a1.

s1

g1

s2

g2

Page 46: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TPTS: Agent a1 Making Decisions

Agent a1 has token.Agent a1 assigns itself to task2.

s1

g1

s2

g2

Page 47: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

TPTS: Completeness

TheoremTPTS solves all well-formed MAPD instances.

Page 48: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Centralized MAPD Algorithm: CENTRAL

CENTRAL assigns agents to tasks in a centralized way:1. assigns parking locations to all free agents using

Hungarian method;2. plans paths for all of them from their current locations to

their assigned parking locations by solving the resulting“one-shot” multi-agent path-finding problem.

Page 49: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

CENTRAL: Running Example

Tasks available for assignment: task1, task2

a1

a2a3

s1

g1

s2

g2

Page 50: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

CENTRAL: Candidate Parking Locations

Pickup locations s1 and s2 + three additional “good” parkinglocations, one for each agent:

s1

g1

s2

g2

Page 51: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

CENTRAL: Assignment of Parking Locations toAgents

CENTRAL uses Hungarian method to find a cost-minimalassignment from parking locations to agents (pickup locationshave priority over other parking locations):

Page 52: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

CENTRAL: Path Finding

CENTRAL plans collision-free paths for all agents from theircurrent locations to their assigned parking locations.CENTRAL plans paths to delivery locations only when agentsreach pickup locations.

s1

g1

s2

g2

Page 53: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Comparisons of Three Algorithms

a1

a2a3

s1

g1

s2

g2

s1

g1

s2

g2

s1

g1

s2

g2

s1

g1

s2

g2

Page 54: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Small Simulated Warehouse Environment

Figure: 21× 35 4-neighbor grid with 50 agents. Gray cells areinventory stations and storage locations. Colored circles are the initiallocations of agents.

Page 55: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Experimental Results: 500 Random Tasks, 10 to 50Agents

I Effectiveness1. Service Time:

CENTRAL < TPTS < TP2. Throughput – # tasks executed per 100 timesteps:

TP < TPTS < CENTRAL3. Makespan – timestep when all tasks are finished:

CENTRAL < TPTS < TPI Runtime per Timestep:

TP < 10 millisecondsTPTS < 200 millisecondsCENTRAL < 4,000 milliseconds

Page 56: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Large Simulated Warehouse Environment

Figure: 81× 81 4-neighbor grid with 500 agents.

Page 57: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Results for TP: 1000 Random Tasks, 100 to 500Agents

100 agents: ∼ 0.09 seconds per timestep500 agents: ∼ 6 seconds per timestep

agents 100 200 300 400 500service time 463.25 330.19 301.97 289.08 284.24

runtime (milliseconds) 90.83 538.22 1,854.44 3,881.11 6,121.06

Page 58: Lifelong Multi-Agent Path Finding for Online Pickup and ... · Lifelong Multi-Agent Path Finding for Online Pickup and Delivery Tasks Hang Ma Jiaoyang Li T. K. Satish Kumar Sven Koenig

Takeaways

MAPD: A “lifelong” version of multi-agent path finding.Three Algorithms:

I Decoupled and complete for well-formed MAPD instances:TP, TPTS.

I Centralized: CENTRAL.

Task Assignment Effort: TP < TPTS < CENTRALEffectiveness: TP < TPTS < CENTRALEfficiency: CENTRAL < TPTS < TP