
10.1117/2.1201407.005526

Improving coordination of unmanned vehicles

Dan Shen, Genshe Chen, Haibin Ling, Khanh Pham, and Erik Blasch

Game theory using physical constraints, time-delay feedback, and asymmetric information structures guides autonomous vehicles.

Unmanned ground vehicles (UGVs) have a variety of uses, including clearing land mines. However, their effectiveness may be limited by the restricted field of view obtained by on-board cameras and sensors. In adversarial or other challenging environments, it can be difficult for autonomous controllers to have sufficient knowledge to decide the best navigation strategy. One solution is to use an unmanned aerial vehicle (UAV) flying above the UGV to oversee a wider area. However, this requires the UAV and UGV to make coordinated autonomous decisions.

One mathematical tool for analyzing such situations, including adversarial ones, is the pursuit-evasion (PE) game.1–3 Such games are applied in areas as varied as geometry and graphs,4, 5 sensor management,6, 7 collision avoidance,8, 9 and high-level information fusion.10 However, PE games are mostly implemented and tested by numerical simulations, where real-life physical constraints, time-delay feedback, and computational feasibility are not fully considered. Most multiplayer PE games also assume that all players are equal, which is not the case when a UAV is coordinating UGVs. To tackle these real-world limitations of PE game solutions, we designed a framework for this setup, as indicated in Figure 1. We built a PE game with the physical operating boundaries of the UAV and UGVs, considered command delays, and tested whether optimal solutions were feasible.

In our framework, robots (used as UGVs) and a drone (used as a UAV) are connected to a computer via a wireless local area network (WLAN). Figure 2 shows the hardware connection diagram. Pursuers are represented by a computer (pursuer agent), which also hosts the three-player PE game. The pursuer version of the game takes as inputs the states of the pursuers (measured by the drone agent), tracked evader states (from the pursuers' local cameras), and the learned cost function (that is, an evaluation of how best to move to catch the evader). Similarly, an evader agent (such as the computational solver on a computer sending commands to the robot) assesses the drone agent states. A tracking module generates the pursuer states from the local camera and the intents of the pursuers (modeled by the pursuers' cost functions) obtained from online learning schemes. A UAV agent (depicted as 'controller' in Figure 1) coordinates the drone movements and the visual entity tracking algorithms. In the drone controller, measurement and communication delays are taken into account based on the drone dynamics model and the supported drone commands.

Figure 1. A hardware-in-loop pursuit-evasion (PE) game framework. AR: Augmented reality.

To obtain the UGV robot states (location, movement, and intent), we designed and implemented visual tracking algorithms over image sequences from the drone camera (global view) and the robot cameras (local views). We designed several markers on the robots and selected the one that achieved the best balance of position accuracy and tracking robustness for use in the PE game's theoretical robot control schemes. The target robots are detected after background modeling, and the robot orientation is estimated from the local gradient patterns. Since the UGVs are moving to perform pursuit-evasion missions, we designed a proportional-integral-derivative (PID) controller for guiding the UAV to follow the evader UGV. With delays in the measurement channels (camera and communication delays), the controller receives out-of-date information. We implemented a delay measurement compensation based on the history of robot movements and drone commands. Figure 3 shows the improved performance that results from the compensation. In parts (a) and (c) we have plotted the true position, the estimated position, and the measured position on the x- and y-axis, respectively. Measured positions are the (x,y) coordinates obtained from the images. Delays cause these measured values to be out of date, so we adjust them for the known delay and previous movements to create the estimated positions. In parts (b) and (d) we have plotted the difference between the estimated position and the true position, with and without delay measurement compensation, for the x- and y-axis, respectively. The results with compensation (blue line) show a smaller position error.
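The compensation idea described above (adjusting a stale measurement for the known delay and the movement that happened in the meantime) can be illustrated with a minimal sketch. It assumes a known, constant delay expressed in whole time steps and a log of commanded ground-frame robot velocities. The function name `compensate_delay`, the velocity log, and the numbers are hypothetical and are not taken from the testbed, which also uses the drone commands.

```python
from collections import deque


def compensate_delay(measured_xy, velocity_history, delay_steps, dt):
    """Advance a delayed (x, y) measurement to the present time.

    velocity_history holds commanded (vx, vy) values, oldest first, one per
    time step of length dt. The last delay_steps entries approximate the
    motion that happened after the image was captured (an assumption of
    this sketch).
    """
    x, y = measured_xy
    # Guard against delay_steps == 0: list[-0:] would return the whole list.
    recent = list(velocity_history)[-delay_steps:] if delay_steps > 0 else []
    for vx, vy in recent:
        x += vx * dt
        y += vy * dt
    return x, y


# Illustrative numbers: constant 0.5 m/s along x, image is 3 steps old.
dt = 0.1
history = deque([(0.5, 0.0)] * 10, maxlen=50)
true_now = (1.0, 2.0)
measured = (true_now[0] - 0.5 * dt * 3, true_now[1])  # stale by 3 steps
estimated = compensate_delay(measured, history, delay_steps=3, dt=dt)
```

In this toy case the estimated position coincides with the true one because the commanded velocity is followed exactly. With real robots the residual error would depend on how well the movement history matches the actual motion.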

Figure 2. Hardware connections.

Figure 3. Plots against time of the true (truth), measured (meas.) and estimated (est.) (a) x-coordinate and (c) y-coordinate. Plots comparing position results for the (b) x-coordinate and (d) y-coordinate before and after delay compensation (comp.) show that the compensation improves the position (x,y) accuracies of robots (evaders) from drone images with delays. dx, dy: The difference (est. − truth) between the estimated x- or y-coordinate and the true value, respectively.

Our testbed includes a derived three-player PE game, tested with real-world systems, as a core component. In the PE game model, there are two UGV pursuers and one UGV evader. There are two versions of the PE game model: one is hosted by the pursuers, named PPEG (pursuer PE game), and the other is maintained by the evader, named EPEG (evader PE game). The PPEG is used to calculate the controls of the pursuer robots based on the pursuers' states, the tracked evader states, and the learned evader's intents (or evader's cost function). The EPEG helps the evader to obtain its control from the evader's states, the tracked pursuers' states, and the adaptively obtained pursuers' intents (or pursuers' cost functions). Action-curve-based solutions have been developed to compute the mixed Nash equilibria, which are probabilities for each action, assuming each player knows the equilibrium strategies of the other players. Each game model includes the states of the corresponding player, the resolved possible strategies, and goals.
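To make the notion of a mixed equilibrium concrete, the following sketch computes one for a two-player, zero-sum, 2x2 matrix game using the standard closed-form solution. This is a deliberately simpler stand-in: it is not the action-curve-based method and not the three-player nonlinear game used in the testbed. The function name and the payoff numbers are hypothetical.

```python
def mixed_nash_2x2_zero_sum(payoff):
    """Equilibrium of a 2x2 zero-sum game (row player maximizes payoff).

    Returns (p, q, value): p is the probability that the row player picks
    row 0, q the probability that the column player picks column 0.
    """
    (a, b), (c, d) = payoff
    row_mins = [min(a, b), min(c, d)]
    col_maxs = [max(a, c), max(b, d)]
    if max(row_mins) == min(col_maxs):
        # A pure saddle point exists, so no randomization is needed.
        i = row_mins.index(max(row_mins))
        j = col_maxs.index(min(col_maxs))
        return (1.0 if i == 0 else 0.0,
                1.0 if j == 0 else 0.0,
                float(payoff[i][j]))
    # No saddle point: each player mixes so the opponent is indifferent.
    denom = a - b - c + d
    p = (d - c) / denom
    q = (d - b) / denom
    value = (a * d - b * c) / denom
    return p, q, value


# Hypothetical payoffs to a pursuer guessing which way an evader turns.
p, q, value = mixed_nash_2x2_zero_sum([[2, 0], [0, 1]])
```

Here each player is made indifferent between its two actions, which is the defining property of a fully mixed equilibrium. The probabilities returned play the same role as the per-action probabilities mentioned in the text.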

Figure 4. Our hardware demonstrator with graphical user interface and scenario. It monitors tracking algorithms, game models, learning algorithms, physical constraint compensation, and the unmanned aerial vehicle. IFT: Intelligent Fusion Technology Inc.

To perform system integration and visualization, we designed a graphical user interface (GUI)-based scenario manager, as shown in Figure 4. The hardware demonstrator combined video-based entity tracking algorithms; three-player game modeling with different information structures; learning algorithms for robot dynamics; action-curve-based (mixed) Nash solutions of nonlinear PE games; sensor and communication delay modeling and compensation; the PID controller for the UAV drone; and the GUI-based scenario manager visualizer. Using the demonstrator, we tested various scenarios, different cooperation strategies, and diverse boundary conditions.
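As an illustration of the PID controller component, here is a minimal one-axis sketch in which a velocity-commanded drone follows a target moving at constant speed. The gains, time step, and the simple integrator model of the drone are assumptions chosen for the example. They are not the testbed's drone dynamics model or its tuned parameters.

```python
class PID:
    """Textbook PID controller acting on a scalar error."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        # No derivative term on the first call (no previous error yet).
        derivative = (0.0 if self.prev_error is None
                      else (error - self.prev_error) / dt)
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


def simulate_follow(steps=400, dt=0.05, target_speed=0.3):
    """Drone (pure integrator, an assumption) follows a moving target."""
    pid = PID(kp=2.0, ki=1.0, kd=0.1)  # hypothetical gains
    drone_x, target_x = 0.0, 1.0
    for _ in range(steps):
        error = target_x - drone_x
        velocity_cmd = pid.update(error, dt)
        drone_x += velocity_cmd * dt
        target_x += target_speed * dt
    return target_x - drone_x  # remaining tracking error


final_error = simulate_follow()
```

The integral term is what lets the drone keep up with a steadily moving target in this sketch. With proportional action alone, a constant lag would remain.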

In summary, we have developed a hardware testbed for autonomous networked robots (UGVs) with the help of a flying drone (UAV) to validate PE game-theoretical solutions. This has applications for clearing mines or searching for a crashed plane, where a UAV providing aerial coverage could coordinate multiple UGVs looking for mines faster than sending them out with distributed (that is, noncoordinated) coverage plans. For a plane crash, a patrol plane monitoring search ships would validate their positions, rather than just having the ships use global positioning system coordinates.

Our testbed integrated robot dynamic models, entity-tracking algorithms, sensor fusion methods, and a PE game demonstration for three robots (two slower pursuers and one faster evader). Based on the robot dynamic model and measured UGV states, we designed a three-player discrete-time game model with a limited action space and limited look-ahead time horizons, with robot controls based on Nash solutions from game theory. We obtained promising results from the hardware-in-loop simulations for real-time robot PE game-theoretical methods. In the next step, we will expand our testbed to include imperfect wireless communication due to adversarial jamming and interference.
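The combination of a limited action space and a limited look-ahead horizon can be illustrated with a small sketch: each robot chooses among eight discrete headings, and the evader picks the heading that maximizes its distance to the nearest pursuer after the pursuers' best joint reply, one step ahead. This is a one-step maximin approximation, not the Nash solution used in the testbed. The point-mass motion model, the speeds, and all names are hypothetical.

```python
import math
from itertools import product

HEADINGS = [k * math.pi / 4 for k in range(8)]  # limited action space


def step(pos, heading, speed, dt):
    """Move a point robot one time step along a heading."""
    return (pos[0] + speed * math.cos(heading) * dt,
            pos[1] + speed * math.sin(heading) * dt)


def nearest_pursuer_distance(pursuers, evader):
    return min(math.dist(p, evader) for p in pursuers)


def one_step_maximin(pursuers, evader, v_p, v_e, dt):
    """Return (distance, evader heading, pursuer headings).

    The evader maximizes, and the pursuers jointly minimize, the distance
    from the evader to the nearest pursuer after one step.
    """
    best = None
    for h_e in HEADINGS:
        e_next = step(evader, h_e, v_e, dt)
        # Pursuers' best joint reply to this evader heading.
        reply = min(
            (nearest_pursuer_distance(
                [step(p, h, v_p, dt) for p, h in zip(pursuers, hs)],
                e_next), hs)
            for hs in product(HEADINGS, repeat=len(pursuers))
        )
        if best is None or reply[0] > best[0]:
            best = (reply[0], h_e, reply[1])
    return best


# Two slower pursuers behind the evader, one above and one below its path.
dist, evader_heading, pursuer_headings = one_step_maximin(
    pursuers=[(-2.0, 1.0), (-2.0, -1.0)], evader=(0.0, 0.0),
    v_p=0.5, v_e=1.0, dt=1.0)
```

A longer horizon would nest this search over several steps, which is where the cost of the look-ahead grows quickly and why both the action set and the horizon are kept small.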

This material is based on research sponsored by the Air Force Research Laboratory under agreement number FA9453-12-C-0228. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Research Laboratory or the US Government.



Author Information

Dan Shen and Genshe Chen
Intelligent Fusion Technology Inc. (IFT)
Germantown, MD

Dan Shen received his MS and PhD in electrical and computer engineering from Ohio State University in 2003 and 2006. He then worked as a research scientist at Intelligent Automation Inc. (MD) and as a project manager at DCM Research Resources LLC (MD). He is currently a principal scientist at IFT, where his interests include game theory and its applications, optimal control, and adaptive control.

Genshe Chen received BS and MS degrees in electrical engineering and a PhD in aerospace engineering, all from Northwestern Polytechnical University, Xian, China. He has undertaken postdoctoral research at Beihang University (China), Wright State University, the Technical University of Braunschweig (Germany), the Flight Division of the National Aerospace Laboratory of Japan, and Ohio State University. In addition, he is CTO of IFT and has worked for over 20 years on electronic warfare, secure communication, target tracking, guidance, navigation and control of aerospace vehicles, decision making under uncertainty, space communication, and situation awareness.

Haibin Ling
Department of Computer and Information Sciences
Temple University
Philadelphia, PA

Haibin Ling is an associate professor at Temple University. He received BS and MS degrees from Peking University, China, and a PhD in computer science from the University of Maryland. He has worked at Microsoft Research Asia, the University of California, Los Angeles, and Siemens Corporate Research. His research interests include computer vision, medical image analysis, human computer interaction, and machine learning.

Khanh Pham
Space Vehicles Directorate
Air Force Research Laboratory (AFRL)
Albuquerque, NM

Khanh Pham is a senior member of SPIE as well as of IEEE. He is an associate fellow of the American Institute of Aeronautics and Astronautics (AIAA) and has been nominated for many AFRL Achievement awards.

Erik Blasch
United States Air Force
Rome, NY

Erik Blasch received his BS from the Massachusetts Institute of Technology, and master's degrees in mechanical engineering, industrial engineering, and health science from Georgia Tech. From Wright State University he has obtained an MBA, MSEE, MS in economics, and a PhD. Currently he is a principal scientist at the AFRL Information Directorate, leading programs in information fusion. He is an SPIE Fellow, associate fellow of the AIAA, and a senior member of IEEE.

References

1. R. Isaacs, Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control, and Optimization, Wiley, New York, 1965.
2. Y. C. Ho, A. E. Bryson Jr., and S. Baron, Differential games and optimal pursuit-evasion strategies, IEEE Trans. Auto. Cont. AC-10 (4), 1965.
3. T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory, Society for Industrial and Applied Mathematics, 1998.
4. L. Guibas, J. C. Latombe, S. LaValle, D. Lin, and R. Motwani, A visibility-based pursuit-evasion problem, Int'l J. Comput. Geom. Appl. 4 (2), pp. 74–123, 1985.
5. F. R. K. Chung, J. E. Cohen, and R. L. Graham, Pursuit-evasion games on graphs, J. Graph Theory 12 (2), pp. 159–167, 1988.
6. M. Wei, G. Chen, E. Blasch, H. Chen, and J. B. Cruz Jr., Game theoretic multiple mobile sensor management under adversarial environments, Int'l Conf. Info. Fusion, 2008.
7. D. Shen, G. Chen, E. Blasch, K. Pham, C. Yang, and I. Kadar, Game theoretic sensor management for target tracking, Proc. SPIE 7697, p. 76970C, 2010. doi:10.1117/12.850870
8. V. Isler, D. Sun, and S. Sastry, Roadmap based pursuit-evasion and collision avoidance, Proc. Robot. Sci. Syst., pp. 257–264, 2005.
9. D. Shen, K. Pham, E. Blasch, H. Chen, and G. Chen, Pursuit-evasion orbital game for satellite interception and collision avoidance, Proc. SPIE 8044, p. 80440B, 2011. doi:10.1117/12.882903
10. E. Blasch, E. Bosse, and D. A. Lambert, High-Level Information Fusion Management and Systems Design, Artech House, Norwood, MA, 2012.

© 2014 SPIE
