risk project design document

Upload: dirk-brand

Post on 14-Apr-2018


  • 7/29/2019 Risk Project Design Document

    1/27

    AI player for Risk

    Design Document

    Dirk Brand, 16077229

    September 26, 2013


    Contents

    1 Introduction
        1.1 Game Phases
    2 Framework
        2.1 Protocol Design
            2.1.1 Protocol Messages
            2.1.2 Protocol Flow
        2.2 Class Design
            2.2.1 Common Objects
            2.2.2 Facilitator
            2.2.3 Controller
            2.2.4 Engine
        2.3 Method Summary
        2.4 User Interface
            2.4.1 Facilitator Connection
            2.4.2 Pre-Game Screen
            2.4.3 Game Screen
        2.5 Logging Functionality
    3 Computer Player
        3.1 Submissive Player
        3.2 Baseline Player
        3.3 Expectiminimax with Alpha-Beta Pruning
        3.4 Monte Carlo Tree Search
            3.4.1 Stochasticity
            3.4.2 The Algorithm
    4 Testing Design


    1 Introduction

    This document describes the design of the game framework of the Risk game outlined in this project. The design of a testing framework is also described. The main focus of the project is to investigate computer players for Risk, so emphasis is placed on the design of the algorithms that will feature in the computer players.

    Additional details can be found at the project website [10].

    1.1 Game Phases

    Our game of Risk consists of several phases, outlined in Figure 1. The diagram illustrates how our game of Risk transitions between game phases.

    Figure 1: The game phases (diagram nodes: Pre-game, Setup, Recruitment, Battle, Manoeuvre).

    The phases are:

    pre-game (P),

    setup (S),

    recruitment (R),

    battle (B), and

    manoeuvre (M).
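The one-letter phase codes above are reused in the protocol tables later in this document. As a minimal sketch, they could be captured in an enumeration. The names `GamePhase` and `fromCode` are assumptions made for illustration; the class diagrams in this document store the phase as a plain `int`.

```java
// Hypothetical enumeration of the five game phases and their one-letter codes.
enum GamePhase {
    PRE_GAME('P'),
    SETUP('S'),
    RECRUITMENT('R'),
    BATTLE('B'),
    MANOEUVRE('M');

    private final char code;

    GamePhase(char code) {
        this.code = code;
    }

    /** Returns the one-letter code used for this phase in the protocol tables. */
    char code() {
        return code;
    }

    /** Looks up a phase by its one-letter code; rejects unknown codes. */
    static GamePhase fromCode(char code) {
        for (GamePhase phase : values()) {
            if (phase.code == code) {
                return phase;
            }
        }
        throw new IllegalArgumentException("Unknown phase code: " + code);
    }
}
```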


    2 Framework

    The framework consists of the following components:

    The controller, which handles the playing of a game between two engines.

    The facilitator, which allows engines to connect to it and then initializes a controller with two engines to start a game.

    The graphical user interface (GUI) for human players.

    The GUI backend engine.

    Various computer based engines (AIs).

    A component that facilitates logging.

    A text-based communication protocol for managing communication between the components.

    The framework structure will be based on a model described in the Go Text Protocol (GTP) [4]. The structure of the framework is shown in Figure 2.

    2.1 Protocol Design

    The communication protocol will be a text-based communication system based on the GTP [4]. For both human players and AI players, the communication between the engine and the controller or facilitator will take place over TCP/IP channels. The communication could then take place over a network.

    Communication will be done in Unicode, so the framework of the game could be implemented in different languages. Additional message text (for success and failure commands) should be in plain text.

    The protocol specifies various messages. A message can either be a request, issued by the controller, or a reply from an engine. If an engine is not able to handle a certain request from the controller, it can reply with an error message.

    Request The structure of a request will be as follows:

    id request_type [arguments]

    The id field is a unique positive integer value and request_type is a string (without white space) describing the specific command, followed by a list of arguments (possibly empty, comma separated).

    Reply The structure of a reply will be as follows:

    id = [responses]


    Figure 2: Structure of the framework. (a) Engines (human or AI) connected to the controller and facilitator; all connections use TCP/IP as the base for communication. (b) Human interaction with the engine: the human player, the user interface and the GUI backend engine.


    The id field is again a positive integer value and should match the id value included with the request that the reply is in response to. The responses field contains a list of values in response to the previous request (comma separated and possibly empty). The = indicates a success. Note that some replies may function as an acknowledgement of success without a list of responses (see the list of requests in Figures 3 and 4).

    Error The structure of a failure reply will be as follows:

    id ? [#]error_message

    The id field is again a positive integer value and should match the id value included with the request that triggered this error. The error_message field is a compulsory string that describes the reason why the request could not be handled (with an optional # symbol as prefix).

    Failure Mode If an error message starts with the # symbol, followed by a message containing only the request type that triggered it, it indicates that the request was not expected by the engine. The controller does not recover, but instead logs the failure and ends the game. If the engine sends an unrecognised or unexpected message to the controller, the failure is logged and the game ends. If the controller or the engine reaches a network timeout, it also does not recover but logs the failure. In all instances where the game ends, the log will state that the misbehaving engine lost the game.
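The three message shapes described above (request, reply and error) can be illustrated with a small formatter and parser. This is a sketch under stated assumptions, not the framework's protocol manager: the class name `ProtocolMessage` and its methods are hypothetical, the square brackets in the message grammar are treated as notation for an optional list rather than literal characters, and fields are assumed to be separated by single spaces.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper that formats and parses the three message shapes.
final class ProtocolMessage {
    enum Kind { REQUEST, REPLY, ERROR }

    final Kind kind;
    final int id;
    final String requestType;  // set for REQUEST only
    final boolean unexpected;  // ERROR only: true when the text started with '#'
    final List<String> values; // arguments, responses, or the single error text

    private ProtocolMessage(Kind kind, int id, String requestType,
                            boolean unexpected, List<String> values) {
        this.kind = kind;
        this.id = id;
        this.requestType = requestType;
        this.unexpected = unexpected;
        this.values = values;
    }

    /** Builds "id request_type arg1, arg2, ..." (the argument list may be empty). */
    static String request(int id, String requestType, String... args) {
        String list = String.join(", ", args);
        return list.isEmpty() ? id + " " + requestType : id + " " + requestType + " " + list;
    }

    /** Builds "id = r1, r2, ..." (a bare "id =" acknowledges success). */
    static String reply(int id, String... responses) {
        String list = String.join(", ", responses);
        return list.isEmpty() ? id + " =" : id + " = " + list;
    }

    /** Builds "id ? [#]error_message". */
    static String error(int id, boolean unexpected, String message) {
        return id + " ? " + (unexpected ? "#" : "") + message;
    }

    /** Parses one line into a message; malformed input raises IllegalArgumentException. */
    static ProtocolMessage parse(String line) {
        String[] parts = line.trim().split("\\s+", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Malformed message: " + line);
        }
        int id = Integer.parseInt(parts[0]);
        if (id <= 0) {
            throw new IllegalArgumentException("The id must be positive: " + line);
        }
        String rest = parts.length == 3 ? parts[2] : "";
        switch (parts[1]) {
            case "=":
                return new ProtocolMessage(Kind.REPLY, id, null, false, splitList(rest));
            case "?": {
                boolean unexpected = rest.startsWith("#");
                String text = unexpected ? rest.substring(1) : rest;
                if (text.isEmpty()) {
                    throw new IllegalArgumentException("The error message is compulsory");
                }
                return new ProtocolMessage(Kind.ERROR, id, null, unexpected,
                        Collections.singletonList(text));
            }
            default:
                return new ProtocolMessage(Kind.REQUEST, id, parts[1], false, splitList(rest));
        }
    }

    // Splits a comma-separated list, trimming items and dropping empty ones.
    private static List<String> splitList(String text) {
        List<String> items = new ArrayList<>();
        for (String item : text.split(",")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                items.add(trimmed);
            }
        }
        return items;
    }
}
```

For example, `ProtocolMessage.error(8, true, "attack")` produces `8 ? #attack`, which under the failure mode above tells the controller that the engine did not expect the attack request.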

    2.1.1 Protocol Messages

    Different messages are used in different phases of the game and some messages may be re-used in different game phases. The facilitator uses a specific set of messages for communicating with the engines. Once a game has been set up, a controller is launched for the game and the controller then communicates with the engines using a different set of messages.

    The two sets of request messages that will be handled by the protocol are shown in Figures 3 and 4 (appropriate responses and game phase usage are also shown).


    Request 1: opponents [AI 1, AI 2, ...]
      Expected reply: =
      Game phase: P
      Description: Sends a list of available opponents to the engine (these include AIs and connected human players).

    Request 2: maps [world map, starcraft map, ...]
      Expected reply: =
      Game phase: P
      Description: Sends a list of available game maps to the engine.

    Request 3: start_choices
      Expected reply: = playerName, opponent, map
      Game phase: P
      Description: Requests the engine to submit the initial configuration of a game. The engine replies with the appropriate choices made by the player.

    Request 4: start_game [Player1name, Player2name, chosenMap]
      Expected reply: =
      Game phase: P
      Description: Informs the engine of the configuration of the game and that a game is starting. The first player name is the first player to place troops and will have their turn first.

    Figure 3: Protocol for communication between the Facilitator and the engines.

    Request 5: initial_own_territories [territory_id1, territory_id2, ...]
      Expected reply: =
      Game phase: S
      Description: Informs the engine of its initial allocated territories. The territories not mentioned belong to the opponent (the engine knows which territories, since it is specified in the map file).

    Request 6: place_troops [number]
      Expected reply: = territory_id1, number1, territory_id2, number2, ...
      Game phase: S, R
      Description: Informs the engine of the number of troops the player receives to place. The engine then replies with a list of territories and the number of troops to place on each territory.

    Request 7: troops_placed [territory_id1, number1, territory_id2, number2, ...]
      Expected reply: =
      Game phase: S, R
      Description: Informs the engine on which territories an opponent placed troops and how many troops the opponent placed on each territory.

    Request 8: attack
      Expected reply: = source_territory_id, dest_territory_id
      Game phase: B
      Description: Requests the engine to provide a source and a destination for an attack. The engine replies with a source and destination, or with an empty list if the player wants to stop attacking.

    Request 9: attack_result [attack_dice1, attack_dice2, attack_dice3, defend_dice1, defend_dice2, source_id, dest_id]
      Expected reply: =
      Game phase: B
      Description: Informs the engine of the results of an attack. The message includes the individual dice rolls of the players (a 0 is returned for unused dice) and where the attack took place. The engine calculates the result of the attack and removes troops from territories if necessary.

    Request 10: manoeuvre
      Expected reply: = source_territory_id, dest_territory_id, number
      Game phase: B, M
      Description: Requests the engine to provide a source and a destination for a manoeuvre, as well as the number of troops to move. The engine replies with a source, destination and number of troops. If the engine does not wish to move any troops, it provides an empty list. The request is used in both the manoeuvre phase as well as in the battle phase when a player defeats a territory belonging to another player and then has to move troops from the source of the attack to the defeated territory.

    Request 11: move_troops [source_territory_id, dest_territory_id, number]
      Expected reply: =
      Game phase: B, M
      Description: Informs the engine that the specified number of troops have been moved from the source to the destination.

    Request 12: result [message]
      Expected reply: =
      Game phase: ALL
      Description: Informs the engine what the result of the game was with the appropriate message (either a victory or a defeat). If the game ends as the result of a failure, the engine is not informed, but simply disconnected. The failure is logged.

    Figure 4: Protocol for communication between the Controller and the engines.
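The attack result message (request 9 in Figure 4) leaves it to the engine to work out troop losses from the raw dice. The sketch below shows one way to do that. It assumes the standard Risk dice rules, which this document does not spell out: each side's dice are sorted from high to low and compared pairwise, and the defender wins ties. The class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper: derives troop losses from the dice in an attack result.
final class AttackResolver {

    /**
     * Returns {attackerLosses, defenderLosses}.
     * A die value of 0 marks an unused die, as in the attack result message.
     * Assumes standard Risk rules: highest dice are paired, ties favour the defender.
     */
    static int[] losses(int[] attackDice, int[] defendDice) {
        int[] attack = sortedDescending(attackDice);
        int[] defend = sortedDescending(defendDice);
        int attackerLosses = 0;
        int defenderLosses = 0;
        int pairs = Math.min(attack.length, defend.length);
        for (int i = 0; i < pairs; i++) {
            if (attack[i] == 0 || defend[i] == 0) {
                break; // one side has no more dice in play
            }
            if (attack[i] > defend[i]) {
                defenderLosses++;
            } else {
                attackerLosses++;
            }
        }
        return new int[] {attackerLosses, defenderLosses};
    }

    // Returns a copy of the dice sorted from highest to lowest.
    private static int[] sortedDescending(int[] dice) {
        int[] copy = dice.clone();
        Arrays.sort(copy);
        for (int i = 0, j = copy.length - 1; i < j; i++, j--) {
            int tmp = copy[i];
            copy[i] = copy[j];
            copy[j] = tmp;
        }
        return copy;
    }
}
```

With attacker dice 6, 3, 1 and defender dice 5, 3, the pairs are 6 against 5 (the defender loses a troop) and 3 against 3 (a tie, so the attacker loses a troop).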


    2.1.2 Protocol Flow

    The protocol is known to the engines, the controller and the facilitator. Thus, when an engine receives a request message, it replies with the correct response and the controller or facilitator would know to expect that response (and vice versa for the engine). This makes it easy to evaluate if an unexpected message has arrived. The facilitator only communicates with the engines during the pre-game phase, then the controller handles the communication with the engines in the other four game phases.

    The flow of a typical game is shown in Figure 5 with blue edges indicating the flow of requests and replies for the engine currently playing their turn and red indicating the same for the other engine. State 12 (a result message) can be reached at the end of the game, or from any state if the game fails at any point in the game.

    Figure 5: The usage of the protocol in our game of Risk (state diagram over the numbered request messages of Figures 3 and 4).

    2.2 Class Design

    The facilitator and the controller components both have access to the protocol manager that communicates with the engines. The engines have a separate protocol manager for communicating with the facilitator and controller. The engines, the facilitator and the controller all have access to the Logger class.

    Class diagrams of the various classes are shown in Figures 6 to 9 with the individual method descriptions shown in Figure 10.

    All local variables in classes are private, but appropriate getters and setters are provided by default.


    2.2.1 Common Objects

    Some object classes are used by multiple classes throughout the framework of the game. The GameState object is abstract and can be extended by users that wish to write their own engines and want to store additional information in the GameState object.

    GameState (Abstract)
      - mapName : String
      - mapLocation : String
      - players : Player []
      - gamePhase : int
      - player1Territories : Territory []
      - player2Territories : Territory []
      - currentPlayerID : int
      + allocateTerritory(int player, int territory)
      + placeTroop(int player, int territory)
      + removeTroop(int player, int territory)

    Territory
      - name : String
      - ID : int
      - continent : String
      - continentID : int
      - neighbours : Territory[]
      - x : int
      - y : int
      - troops : int

    Player
      - name : String
      - id : int
      - color : Color
      - ipAddress : String
      - port : int

    ConnectedPlayer
      - playerInfo : Player
      - socket : Socket
      - output : OutputStream
      - input : BufferedReader
      + closeConnections()
      + send(String mes)

    Logger
      - currentLog : File
      - debugLevel : int
      + log(int level, String message)

    ProtocolManager
      - messages : String []
      - clients : ConnectedPlayer []
      + sendCommand(int destID, int id, String command, String arguments)

    Figure 6: Common objects used by other classes.
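To illustrate how an engine author might extend the abstract GameState to store additional information, the sketch below pairs a heavily simplified stand-in for GameState with a subclass that counts the troops placed. Only the three public method names come from Figure 6. The map-based storage, the `getTroops` accessor and the `CountingGameState` subclass are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the abstract GameState of Figure 6 (storage is assumed).
abstract class GameState {
    private final Map<Integer, Integer> ownerOf = new HashMap<>();  // territory id -> player id
    private final Map<Integer, Integer> troopsOn = new HashMap<>(); // territory id -> troop count

    public void allocateTerritory(int player, int territory) {
        ownerOf.put(territory, player);
        troopsOn.putIfAbsent(territory, 0);
    }

    public void placeTroop(int player, int territory) {
        requireOwner(player, territory);
        troopsOn.merge(territory, 1, Integer::sum);
    }

    public void removeTroop(int player, int territory) {
        requireOwner(player, territory);
        if (troopsOn.get(territory) == 0) {
            throw new IllegalStateException("No troops left on territory " + territory);
        }
        troopsOn.merge(territory, -1, Integer::sum);
    }

    public int getTroops(int territory) {
        return troopsOn.getOrDefault(territory, 0);
    }

    private void requireOwner(int player, int territory) {
        Integer owner = ownerOf.get(territory);
        if (owner == null || owner != player) {
            throw new IllegalArgumentException(
                    "Territory " + territory + " does not belong to player " + player);
        }
    }
}

// Hypothetical engine-specific extension that records how many troops were placed.
class CountingGameState extends GameState {
    private int troopsPlaced;

    @Override
    public void placeTroop(int player, int territory) {
        super.placeTroop(player, territory);
        troopsPlaced++;
    }

    public int getTroopsPlaced() {
        return troopsPlaced;
    }
}
```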

    2.2.2 Facilitator

    The facilitator allows engines to connect to the controller and once two engines start a game, launches the controller with the two connected players and their choices made in the pre-game phase. The structure of the facilitator is shown in Figure 7.


    FacilitatorLogic
      - PM : ProtocolManager
      - log : Logger
      + main()
      + readAIOpponents(String path)
      + connectPlayer(Socket socket, int id)
      + disconnectPlayer(int id)

    Figure 7: Structure of the facilitator.

    2.2.3 Controller

    The controller keeps the game state and performs all the logic of the game (management of turns, simulation of dice rolls, management of game phases). The structure of the controller is shown in Figure 8.

    ControllerLogic
      - game : GameState
      - PM : ProtocolManager
      - log : Logger
      - randomSeed : long
      + ControllerLogic(ConnectedPlayer p1, ConnectedPlayer p2, String mapName, Logger log)
      + playGame()
      + loadMap(String mapPath)
      + resolveAttack()
      - genRoll()

    Figure 8: Structure of the controller.

    2.2.4 Engine

    The Engine consists of an EngineProtocolManager to process all requests from the controller and send relevant replies, and an EngineLogic unit that handles said requests by either making changes to the local game state, or retrieving the information requested by the controller. The EngineLogic unit has a local copy of the game state.

    The controller, facilitator and engines all have access to the same set of map files. If an engine is outdated and does not have a map specified by the facilitator and the controller, it should be updated by getting the latest maps from the project website [10]. All map files will be in the same format as the format used in the Risk implementation by Yura [7].


    The structure of the engine is shown in Figure 9.

    EngineLogic
      - game : GameState
      - epm : EngineProtocolManager
      + main()
      + establishControllerConnection(String address, int port)
      + playGame(Player p1, Player p2)
      + loadMap(String mapName, String mapURL)
      + troopPlaced(int [] territory_ids, int [] numberOfTroops)
      + recruitment(int number)
      + resolveAttack(int a1, int a2, int a3, int d1, int d2)
      + transferTerritoryControl(int territory_id, int player_id1, int player_id2)
      + setOppManoeuvre(int territory_id1, int territory_id2, int number)

    EngineProtocolManager
      - replies : String []
      - controllerIP : String
      - controllerPort : int
      - controllerSocket : Socket
      + process(String m)
      - sendSuccess(int id, String arguments)
      - sendFailure(int id, String message)

    Figure 9: Structure of the engine.


    2.3 Method Summary

    All methods have a void return type, unless specified otherwise.

    Each entry lists the method (with its return type, if any), followed by its description, grouped by class.

    GameState

      allocateTerritory(int player_id, int territory_id)
        Allocates the specified territory to the player (adds the territory to the player's list of territories).

      placeTroop(int player_id, int territory_id)
        Increments the number of troops at the territory belonging to the specified player.

      removeTroop(int player_id, int territory_id)
        Decrements the number of troops of the territory belonging to the specified player.

    ConnectedPlayer

      closeConnections()
        Close the Socket, OutputStream and BufferedReader associated with ConnectedPlayer.

      send(String message)
        Send the specified message via the Socket of the ConnectedPlayer.

    Logger

      log(String message, int level)
        Checks if the specified level is less than or equal to the logging level of the Logger instance and if so, appends the message to the log.

    ProtocolManager

      sendCommand(int destID, int id, String command, String arguments)
        Sends a message to the ConnectedPlayer that corresponds to the destID. The message consists of the id value (if positive), the command String and the arguments String (unless empty).

    ControllerLogic

      ControllerLogic(ConnectedPlayer p1, ConnectedPlayer p2, String mapName, Logger log)
        Constructor for the ControllerLogic class. Initiates the controller with two ConnectedPlayers and a map to play. The controller is also initialised with a Logger object, as it keeps the same log as the facilitator.

      playGame()
        Starts a game at the setup phase. The method will cycle through the game phases and the player turns and communicate with the engines while updating the game state when changes happen in the game.

      loadMap(String mapPath)
        Loads the map from the file at the mapPath. Will generate a list of territories from the map file and allocate to players randomly.

      resolveAttack()
        Gets dice rolls and removes troops from the appropriate territories in the game state, then informs players of the result.

      int genRoll()
        Generates a random number between one and six.

    FacilitatorLogic

      main()
        The starting point of the facilitator. Initialises the ProtocolManager and Logger global objects, opens a ServerSocket and waits for players to connect to it.

      readAIOpponents(String path)
        Reads the list of AI opponents in the directory at the given path and adds each of them to the list of ConnectedPlayers in the protocol manager kept by the facilitator.

      connectPlayer(Socket socket, int id)
        Adds a ConnectedPlayer to the list of connected players in the protocol manager with the given socket and id value.

      disconnectPlayer(int id)
        Removes the ConnectedPlayer corresponding to the given identifier from the list of connected players and calls the closeConnections() method inside the ConnectedPlayer class.

    EngineLogic

      main()
        The starting point of the engine. Initialises the objects and fields of the class. Attempts to connect to the controller and on a successful connection, awaits further instructions from the controller.

      establishFacilitatorConnection(String address, int port)
        Connects to the facilitator. Throws an IOException if the port is already bound to another connection or an UnknownHostException if no available connection is found at the specified address.

      playGame(Player p1, Player p2)
        Initializes the game phases, starting with the setup phase. All messages from the controller will be received and processed, with relevant replies sent back.

      loadMap(String mapPath)
        Loads the map from the file at the mapPath to display in the user interface. It will process the territories listed in the map file and allocate territories to players based on the list sent by the controller with the initial_own_territories message. Throws a FileNotFoundException if the map file at the provided path does not exist.

      troopsPlaced(int [] territory_ids, int [] numberOfTroops)
        After receiving the troops_placed message from the controller, this method will be called to update the game state.

      recruitment(int number)
        Resolves the recruitment of the given number of troops during the recruitment phase. The player selects territories where to place recruited troops and the engine sends the selection to the controller.

      resolveAttack(int a1, int a2, int a3, int d1, int d2)
        Resolves the attack between the chosen source and destination territory. The first three integer parameters are the values of the attacker's dice and the other two the values of the defender's dice (a value of zero for unused dice). The result of the battle is computed and the appropriate troop numbers of the territories involved in the attack are updated in the game state.

      transferTerritoryControl(int territory_id, int player_id1, int player_id2)
        Removes the territory from the list of territories belonging to the first player and adds it to the list of territories belonging to the second player.

      setOppManoeuvre(int territory_id1, int territory_id2, int number)
        Decrements the number of troops of the source by the specified number, then increments the destination number of troops with the same number.

    EngineProtocolManager

      process(String m)
        Process the message sent by the controller by calling the relevant methods and sending the correct reply.

      sendSuccess(int id, String arguments)
        Sends a success message to the controller on the controllerSocket. The message will consist of the id value, if a positive id parameter is given, a = and the arguments (unless an empty parameter is provided).

      sendFailure(int id, String message)
        Sends a failure message to the controller on the controllerSocket. The message will consist of the id value, if a positive id parameter is given, a ? and the message String.

    Figure 10: Method descriptions.


    2.4 User Interface

    The user interface will be created using the Java Swing API. The user interface will consist of only two windows. These will include a window to set up a game (selecting an opponent, selecting the game map, choosing a name and choosing a colour) and a window that will function as a front-end for the game engine. This will contain a panel for displaying information as well as a panel that displays the map and the various troop allocations per territory. Some prototype screenshots are shown in Figures 11 to 13. Each of the prototype interfaces consists of various components. These components will also be discussed.

    2.4.1 Facilitator Connection

    Before any game playing can commence, the facilitator has to be launched. Since communication can be done over the network, the facilitator can be on either of the human players' machines, or on a different remote machine. Any human player that wishes to play a game must first establish a connection to the facilitator.

    The facilitator can be used in two ways. Either a player connects to the facilitator and creates a game with their choices of name, map and opponent (either an AI or a player that has already connected to the facilitator) or they join a game that has been created by another player. The facilitator could also be circumvented and the controller could be launched directly by a handler that will set up a game between two engines. This will be used when testing AI engines.

    The window in Figure 11 will co-ordinate the connection procedure. The player need only provide the internet address (can be localhost), the port on which to connect to the controller and the proxy (if left blank, the default system proxy is used). Once a player has connected to the controller, he/she is registered, so when another human player connects to the controller, they will be able to choose any previously connected players as opponents.


    (a) Prototype GUI.

    (b) Warning message if port value not filled in.

    Figure 11: Connecting to the controller.

    Figure 11 consists of three JLabels, three JTextFields and a JButton. A user can enter text into the text fields, then press connect. If any of the fields are left empty, an error message will be displayed (as in Figure 11b).


    2.4.2 Pre-Game Screen

    Once a player has connected to the facilitator, the next window will allow a player to choose a name, select which opponent they wish to play against, select a map to use in the game and choose the colour each player will be represented by. The prototype for this window is shown in Figure 12a.

    (a) Prototype for the pre-game phase window.

    (b) Warning message if user did not fill in a name.

    Figure 12: The Pre-game Screen

    The pre-game screen consists of a JList that shows the list of available opponents, another JList with the list of available maps, two JColorChoosers that the user can use to associate players with colours, a JTextField that takes the player's name, and a JButton that allows the user to proceed to the next window. If the player does not enter a name, choose an opponent or choose a map, an error message will be displayed (as in Figure 12b).

    2.4.3 Game Screen

    The game screen will show a panel that represents the actions a player may take during each phase of the game as well as the map with the troop allocations per territory. The actions a player takes might influence the game state and what will be displayed on both the panel and the map. When it is not a player's turn, the player will be informed of their opponent's actions in this window.

    As is shown in Figures 13a, 13b and 13c, the panel on the left shows different content for the different game phases.

    (a) During the setup and recruitment phases.

    (b) During the battle phase.


    (c) During the manoeuvre phase.

    Figure 13: The front-end for the engine

    The map is an image, with circles (indicating which player owns a territory) and numbers (indicating the number of troops currently on the territory) drawn on it with methods from the Java Swing API.

    The recruitment panel consists of a list of territories that belong to the player currently playing their turn, with JTextFields where the player indicates how many troops to recruit to each territory. When the player is satisfied with their choices and all the recruited troops have been placed, they continue to the next phase by pressing the JButton that says "Done".

    The battle panel contains two JLabels saying "Source" and "Destination", with two JComboBoxes that function as lists of territories the player can choose from. The battle panel also contains a JTextArea that displays all battle results (like the outcome of dice rolls) and JButtons that allow the player to "Attack Again", to "Manoeuvre" troops from the source of the attack to the destination, or to end the battle phase by pressing the "Done" button. The "Manoeuvre" button takes the player to the manoeuvre window (Figure 13c), but with the source and destination fixed.

    The manoeuvre panel has three JLabels, two JComboBoxes, a JSpinner and a JButton. The comboboxes, like in the battle panel, allow the player to select a territory from a list of territories. The spinner allows the player to specify how many troops should be moved from the source to the destination. The player then presses the "Done" button to end their turn (if in the manoeuvre phase) or to continue attacking (if still in the battle phase).

    If, during any of the phases, the player makes illegal selections (like attempting to move troops between two territories that are not adjacent), an error message will be displayed like in Figure 14.


    Figure 14: Player attempted to manoeuvre troops between two disconnected territories.

    2.5 Logging Functionality

    The controller, facilitator and the engines will all have access to the logger and will periodically generate information and store it in a log file that can be consulted after a game has been played. The facilitator and controller will have one log file, while the engines each have their own. The log that the controller and facilitator keep can be used for all levels of logging, while the log that an engine keeps could only be used for debugging (as it does not have enough information to replay a game).

    Different levels of logging can be done. These levels also subsume one another, so level 30 will also contain the information of levels 10 and 20. This is not strictly true for the log the engines keep, as they will only be able to do logging for debugging that will not contain all the information from replay levels.

    The levels are:

    10. minimal,

    20. replay,

    30. debug (input and output),

    40. debug (Method invocations), and

    50. debug (All variables).

    The structure of the predetermined levels allows a user to set custom logging levels between existing levels when implementing their own engine.

    A user can put different invocations of the log method (specified in Figure 10) with a specific level in different parts of the code. If the level is less than or equal to the level of the logger, the message will be logged. So the log will only show as much as the user has set the log to show.
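The level check described above can be sketched as follows. To keep the example self-contained it writes to an in-memory buffer rather than to the log file of the Logger class in Figure 6. The class name `LevelLogger`, the constant names and the line format are assumptions made for illustration.

```java
// Hypothetical in-memory logger that applies the "level <= logger level" rule.
final class LevelLogger {
    static final int MINIMAL = 10;
    static final int REPLAY = 20;
    static final int DEBUG_IO = 30;
    static final int DEBUG_METHODS = 40;
    static final int DEBUG_VARIABLES = 50;

    private final int debugLevel;
    private final StringBuilder sink = new StringBuilder();

    LevelLogger(int debugLevel) {
        this.debugLevel = debugLevel;
    }

    /** Appends the message only if its level does not exceed the logger's level. */
    void log(int level, String message) {
        if (level <= debugLevel) {
            sink.append(level).append(": ").append(message).append('\n');
        }
    }

    /** Returns everything logged so far. */
    String contents() {
        return sink.toString();
    }
}
```

A logger created with `new LevelLogger(LevelLogger.REPLAY)` keeps messages logged at levels 10 and 20 and drops a message logged at a custom level such as 25, which is how the gaps between the predetermined levels can be used.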

    The log will, for instance, show only the pre-game details and the result of the game if the logger is set to the minimal level. If the replay level is set, all the information of the setup and the various player turns and game phases will be shown. If the debug option is selected, the log will show which methods were called and from where the call originated (in addition to all the replay information).

    If the game fails at any point, the failure is logged and the log is closed.

    3 Computer Player

    Four computer players will be implemented:

    A submissive AI that only defends and never attacks.

    A baseline AI that plays according to a greedy scheme.

    An AI making use of expectiminimax tree search with alpha-beta pruning.

    An AI based on Monte Carlo Tree Search.

    To allow testing of these computer players against the AI players implemented by Yura [7], the interface specified by Yura will be implemented for each AI. Some of the methods might not necessarily be used by the AI players (e.g. trading is not implemented in the version of Risk used in this project), but will return default values for the purpose of robustness when inserted into Yura's framework.

    AI Player (interface)
      - game : GameState
      - id : int
      - name : String
      + setGame(GameState game)
      + getType()
      + getCommand()
      + getBattleWon()
      + getTacMove()
      + getTrade()
      + getPlaceArmies()
      + getAttack()
      + getRoll()
      + getCapital()
      + getAutoDefendString()

    Implementing classes: AI Baseline, AI Submissive, AI MCTS, AI ExpectiMinimaxAlphaBeta


    3.1 Submissive Player

    The submissive player will make defensive choices based on a basic strategy. During the setup phase, it will place a single troop on each territory until each territory has two troops, then a third, etc. During the recruitment phase, it will place half of the recruited troops on the territory it owns with the least number of troops, and the other half on the territory with the second least number of troops. During the battle phase, it will never attack, but only defend. During the manoeuvre phase, it will move troops from the territory with the most troops to the territory with the least number of troops in such a way that the two territories will have the same number of troops after the manoeuvre.
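The recruitment and manoeuvre rules of the submissive player can be sketched as two small functions. Several details are assumptions because the description leaves them open: when the number of recruited troops is odd, the extra troop goes to the weakest territory; when the difference between the strongest and weakest territory is odd, the move is rounded down; and adjacency between the two territories is not checked. The class and method names are hypothetical.

```java
// Hypothetical sketch of the submissive player's recruitment and manoeuvre choices.
final class SubmissivePolicy {

    /**
     * troops[i] is the troop count on the i-th owned territory.
     * Returns how many recruited troops to place on each territory.
     */
    static int[] recruit(int[] troops, int recruited) {
        int[] placement = new int[troops.length];
        if (troops.length == 0) {
            return placement;
        }
        int least = 0;
        for (int i = 1; i < troops.length; i++) {
            if (troops[i] < troops[least]) {
                least = i;
            }
        }
        int second = -1;
        for (int i = 0; i < troops.length; i++) {
            if (i != least && (second == -1 || troops[i] < troops[second])) {
                second = i;
            }
        }
        if (second == -1) {
            placement[least] = recruited; // only one territory owned
            return placement;
        }
        placement[second] = recruited / 2;
        placement[least] = recruited - recruited / 2; // assumption: odd troop goes to the weakest
        return placement;
    }

    /**
     * Returns {sourceIndex, destIndex, number} evening out the strongest and
     * weakest territory, or an empty array when no troops need to move.
     */
    static int[] manoeuvre(int[] troops) {
        if (troops.length < 2) {
            return new int[0];
        }
        int most = 0;
        int least = 0;
        for (int i = 1; i < troops.length; i++) {
            if (troops[i] > troops[most]) {
                most = i;
            }
            if (troops[i] < troops[least]) {
                least = i;
            }
        }
        int number = (troops[most] - troops[least]) / 2; // assumption: round down
        return number > 0 ? new int[] {most, least, number} : new int[0];
    }
}
```

For troop counts 5, 1 and 3, the manoeuvre moves two troops from the first territory to the second, leaving both with three troops.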

    3.2 Baseline Player

    The baseline player will play according to a simple greedy scheme. During the setup phase, it will behave like the submissive player and repeatedly place a single troop on every territory until no more troops are left. During the recruitment phase, it will place all of its troops on the territory it owns where the ratio of troops on the territory to troops on neighbouring enemy territories is the highest (e.g. if there are three troops on the territory and the opponent has four neighbouring territories each with two troops, the ratio is $\frac{3}{2 \cdot 4} = \frac{3}{8} = 0.375$). In the event of a tie, a territory is chosen at random. During the battle phase, it will repeatedly attack from the territory on which it placed its recruited troops, targeting the neighbouring territory with the fewest troops. It will keep attacking until either its troops run out or the territory is defeated, in which case it will move all remaining troops to the defeated territory and repeat the procedure from the new territory. It will follow the same manoeuvre scheme as the submissive player.
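The recruitment rule of the baseline player could look roughly like the sketch below. The function names and the board representation are hypothetical, and skipping territories that have no enemy neighbours (where the ratio is undefined) is an assumption that the text does not address.

```python
import random


def threat_ratio(own_troops, enemy_neighbour_troops):
    """Ratio of own troops to the total troops on neighbouring enemy territories."""
    total = sum(enemy_neighbour_troops)
    if total == 0:
        return None  # no enemy neighbours: ratio undefined (handling is an assumption)
    return own_troops / total


def choose_recruit_territory(board, rng=random):
    """Pick the owned territory with the highest ratio; ties are broken at random.

    board: dict mapping territory -> (own troops, list of enemy neighbour troop counts).
    Returns None if no owned territory borders an enemy.
    """
    ratios = {}
    for territory, (own, enemies) in board.items():
        ratio = threat_ratio(own, enemies)
        if ratio is not None:
            ratios[territory] = ratio
    if not ratios:
        return None
    best = max(ratios.values())
    candidates = sorted(t for t, r in ratios.items() if r == best)
    return rng.choice(candidates)


example_ratio = threat_ratio(3, [2, 2, 2, 2])  # the worked example from the text
choice = choose_recruit_territory({"X": (3, [2, 2, 2, 2]), "Y": (4, [2]), "Z": (1, [])})
```

Here `example_ratio` reproduces the 0.375 of the worked example, and `choice` is Y, whose ratio of 2.0 is the highest.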

    3.3 Expectiminimax with Alpha-Beta Pruning

    The expectiminimax algorithm is a recursive algorithm that builds a game tree. The game tree is a graph consisting of nodes representing different states of the game and directed edges between nodes representing the actions that change one game state into another. These actions are either choices based on the current game state at a node or stochastic events. Expectiminimax makes use of heuristics when evaluating the value of the game state at a given node.

    Alpha-beta pruning seeks to improve expectiminimax by decreasing the number of nodes that are evaluated. In principle, it stops investigating a branch of the game tree that cannot do better than a previously determined best value. In that case, the branch is pruned and the search continues on a different branch.

    The pseudocode for expectiminimax tree search with alpha-beta pruning is shown below. The initial call to Emm_AB(node, depth, α, β, maxPlayer) will be with the current game state as starting node, a predefined depth to search to, α = −∞, β = +∞ and maxPlayer = true (indicating that in the current game state, the current player wishes to maximise his/her gain). The method nextPlayer(node) is a boolean method that determines which player is next by examining the game state at the node. It could be the same player, in which case the method would return true, or the next player, in which case the method would return false.

    procedure Emm_AB(node, depth, α, β, maxPlayer)
        if node is terminal node then
            return Result of game
        end if
        if depth == 0 then
            return Heuristic value of node
        end if
        if random event at node then                  // Return expected value of children
            let a := 0
            for all children of node, as child do
                a := a + (Probability[child] * Emm_AB(child, depth − 1, α, β, nextPlayer(node)))
            end for
            return a
        else if maxPlayer then                        // Return value of max child
            for all children of node, as child do
                α := max(α, Emm_AB(child, depth − 1, α, β, nextPlayer(node)))
                if β ≤ α then
                    break
                end if
            end for
            return α
        else                                          // Return value of min child
            for all children of node, as child do
                β := min(β, Emm_AB(child, depth − 1, α, β, nextPlayer(node)))
                if β ≤ α then
                    break
                end if
            end for
            return β
        end if
    end procedure
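A runnable sketch of the same idea on a hand-built tree is given below. It is an illustration under assumptions, not the project's implementation: the `Node` class, its `kind` field (which replaces the maxPlayer flag and nextPlayer(node)) and the example tree are hypothetical. The sketch also deviates from the pseudocode in one respect: children of a chance node are searched with the full (−∞, +∞) window, so that the expected value is computed from exact child values rather than from pruned bounds.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                       # "max", "min", "chance" or "leaf"
    value: float = 0.0              # game result (leaf) or heuristic estimate
    children: List["Node"] = field(default_factory=list)
    probs: List[float] = field(default_factory=list)   # chance nodes only


def emm_ab(node, depth, alpha=-math.inf, beta=math.inf):
    """Expectiminimax with alpha-beta pruning at max and min nodes."""
    if node.kind == "leaf" or not node.children:
        return node.value           # terminal node: result of the game
    if depth == 0:
        return node.value           # depth limit: heuristic value of the node
    if node.kind == "chance":
        # Expected value over outcomes; full window keeps child values exact.
        return sum(p * emm_ab(c, depth - 1)
                   for p, c in zip(node.probs, node.children))
    if node.kind == "max":
        best = -math.inf
        for child in node.children:
            best = max(best, emm_ab(child, depth - 1, alpha, beta))
            alpha = max(alpha, best)
            if beta <= alpha:
                break               # the minimiser will never allow this branch
        return best
    best = math.inf
    for child in node.children:
        best = min(best, emm_ab(child, depth - 1, alpha, beta))
        beta = min(beta, best)
        if beta <= alpha:
            break                   # the maximiser already has a better option
    return best


def leaf(v):
    return Node("leaf", v)


root = Node("max", children=[
    Node("min", children=[leaf(3.0), leaf(5.0)]),
    Node("chance", children=[leaf(10.0), leaf(2.0)], probs=[0.25, 0.75]),
])
root_value = emm_ab(root, depth=2)
```

In the example, the min node is worth 3 and the chance node is worth 0.25 · 10 + 0.75 · 2 = 4, so the maximising root evaluates to 4.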

    3.4 Monte Carlo Tree Search

    This algorithm also involves building a game tree, but unlike expectiminimax, where the tree is built to a predefined depth and pruned with alpha-beta pruning, this game tree is grown only until the allowed computation time (usually predetermined) has expired. The tree consists of nodes (representing the game state at different time steps) and directed edges (representing an action being performed).


    Each iteration [1] of the algorithm consists of the following phases:

    Selection - Recursively select nodes until an expandable node is reached. A node is expandable if it has children that have not been visited and is also not a terminal node.

    Expansion - Add one or more child nodes to the tree from the expandable node. The children added depend on what actions are available at the expandable node.

    Simulation - A simulation is performed from the newly added node(s). The simulation typically involves playing a game until the game finishes.

    Backpropagation - The result of the simulation (which player won the game) is propagated back to the root node and every intermediate node's status is updated. Each node carries a ratio of games won to games lost (obtained by simulating games from the node), and this ratio is updated in the backpropagation phase. Each node also carries a count of how many times it has been visited, which is also updated in the backpropagation phase.

    The algorithm is iterated until the computing time expires. When the algorithm stops, the actions that lead from the root are considered and the best-performing action is returned. The best-performing action is defined as either the action from the root node that leads to the child with the highest win-to-loss ratio (max reward child) or the action that leads to the child that has been visited the most (robust child). For using MCTS with Risk AIs, the robust child will be used.
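The two definitions of the best-performing action can disagree, as the small sketch below illustrates. The `ChildStats` record, the action names and the statistics are invented for illustration only, and the win rate (wins divided by visits) is used as a stand-in for the win-to-loss ratio to avoid dividing by zero when a child has no losses.

```python
from dataclasses import dataclass


@dataclass
class ChildStats:
    action: str     # action leading from the root to this child (hypothetical names)
    wins: int       # simulations through this child that ended in a win
    visits: int     # number of times this child has been visited


def max_reward_child(children):
    """Child with the highest win rate (stand-in for the win-to-loss ratio)."""
    return max(children, key=lambda c: c.wins / c.visits if c.visits else 0.0)


def robust_child(children):
    """Child that has been visited the most."""
    return max(children, key=lambda c: c.visits)


children = [
    ChildStats("attack", wins=3, visits=4),       # high win rate, few samples
    ChildStats("fortify", wins=60, visits=100),   # lower win rate, many samples
]
by_reward = max_reward_child(children).action
by_visits = robust_child(children).action
```

In this made-up example, the max reward child is "attack" (a win rate of 0.75 from only four visits), whereas the robust child is "fortify", whose estimate rests on far more simulations.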

    3.4.1 Stochasticity

    Each simulation contains stochastic elements, since there are random dice rolls involved during the attack phase.

    Some nodes will have choices that lead to nodes with stochasticity (e.g. if the player chooses to attack, the choice would lead to a dice roll). It is not yet clear how stochasticity will be handled in the MCTS algorithm. The algorithm needs to be adapted to handle the stochasticity in the Selection and Backpropagation phases.

    During the selection phase, when the selection reaches a node with stochasticity, the edges from the node will be the various outcomes with their relevant probabilities, and all the edges of a stochastic node could be expanded (since they would all be possible). During backpropagation, the results of expanding and simulating along one of these edges could be added to the current results at the node, weighted according to the probability of the outcome of the edge. Using selection and backpropagation in this way would result in a much wider game tree. This will be thoroughly investigated during the implementation and testing of the MCTS algorithm.
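To make the "outcomes with relevant probabilities" concrete, the sketch below enumerates every dice roll of a single battle and tallies the resulting troop losses. It assumes the standard Risk dice rules (dice are compared highest against highest, and the defender wins ties), which this section does not spell out; the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def battle_outcomes(attack_dice, defend_dice):
    """Return {(attacker_losses, defender_losses): probability} for one battle."""
    counts = Counter()
    for roll in product(range(1, 7), repeat=attack_dice + defend_dice):
        attack = sorted(roll[:attack_dice], reverse=True)
        defend = sorted(roll[attack_dice:], reverse=True)
        attacker_losses = defender_losses = 0
        # Compare highest with highest, second highest with second highest.
        for a, d in zip(attack, defend):
            if a > d:
                defender_losses += 1
            else:                    # the defender wins ties
                attacker_losses += 1
        counts[(attacker_losses, defender_losses)] += 1
    total = 6 ** (attack_dice + defend_dice)
    return {outcome: Fraction(n, total) for outcome, n in counts.items()}


three_vs_two = battle_outcomes(3, 2)
```

Under these assumptions, a stochastic node for a three-dice attack against two defending dice would have three outgoing edges: the defender loses two troops (2890/7776), each side loses one (2611/7776), or the attacker loses two (2275/7776).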


    3.4.2 The Algorithm

    The pseudocode of the algorithm is as follows [2], where ST denotes the search tree:

    procedure MCTSSearch(Node root_node)
        while time not expired do
            current_node ← root_node
            while current_node ∈ ST do                        // Selection
                last_node ← current_node
                current_node ← Select(current_node)
            end while
            last_node ← Expand(last_node)                     // Expansion
            R ← Simulate(last_node)                           // Simulation
            current_node ← last_node
            while current_node ∈ ST do                        // Backpropagation
                current_node.UpdateRatio(R)
                current_node.visit_count++
                current_node ← current_node.parent
            end while
        end while
        return Action(BestChild(root_node))
    end procedure
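The four phases can be seen end to end in the following sketch, which runs MCTS on a deliberately tiny deterministic game instead of Risk: players alternately take one to three stones, and whoever takes the last stone wins. Everything here is an assumption made for illustration: the toy game, the class and function names, the use of the UCT formula in Select (the pseudocode does not fix a selection rule), and a fixed iteration budget in place of a time limit so that runs are reproducible. The final choice is the robust child.

```python
import math
import random


class Node:
    def __init__(self, stones, player, parent=None, action=None):
        self.stones, self.player = stones, player    # player to move: 0 or 1
        self.parent, self.action = parent, action
        self.children = []
        self.untried = [n for n in (1, 2, 3) if n <= stones]
        self.wins = 0.0     # wins for the player who moved INTO this node
        self.visits = 0


def select(node, c=1.4):
    """UCT selection (an assumed rule): balance win rate against exploration."""
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))


def simulate(stones, player, rng):
    """Random playout from a non-terminal state; returns the winning player."""
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return player
        player = 1 - player


def mcts_search(root_stones, iterations=2000, seed=0):
    """Return the number of stones to take (requires root_stones >= 1)."""
    rng = random.Random(seed)
    root = Node(root_stones, player=0)
    for _ in range(iterations):
        node = root
        while not node.untried and node.children:        # Selection
            node = select(node)
        if node.untried:                                 # Expansion
            take = node.untried.pop()
            child = Node(node.stones - take, 1 - node.player, node, take)
            node.children.append(child)
            node = child
        if node.stones == 0:                             # Simulation
            winner = 1 - node.player     # the previous mover took the last stone
        else:
            winner = simulate(node.stones, node.player, rng)
        while node is not None:                          # Backpropagation
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).action   # robust child


best_take = mcts_search(3)
```

With three stones left, taking all three wins immediately, so that child wins every simulation and accumulates the most visits. Handling Risk's dice rolls would additionally require the chance-node adaptations discussed in Section 3.4.1, which this sketch does not attempt.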

    4 Testing Design

    The framework will be tested in several ways. These are listed below.

    Unit testing for the methods used in the facilitator, controller, engine and AI classes, as well as in the common objects.

    Integration testing of the various classes and also the interaction of the different components over the network.

    Code coverage testing of all code used in the project.

    User Interface testing of all the windows used in the user interface.

    Testing of the computer players through a combination of the evaluation plan and the testing plan.

    The unit and integration testing can be organised with a tool like the Apache Maven Project [9], which organises a project build schedule and can split unit and integration tests to build and execute at different times. The tests can be automated with tools like CodePro Analytix [8]. For code coverage, tools like EclEmma [6] and eCobertura [3] could be used. The user interfaces can be tested with some behaviour testing using a tool like FEST [5].


    References

    [1] Cameron B Browne, Edward Powley, Daniel Whitehouse, Simon M Lucas, Peter I Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez, Spyridon Samothrakis, and Simon Colton. A Survey of Monte Carlo Tree Search methods. Computational Intelligence and AI in Games, IEEE Transactions on, 4(1):1-43, 2012.

    [2] Guillaume M JB Chaslot, Mark HM Winands, Jaap Van Den Herik, Jos WHM Uiterwijk, and Bruno Bouzy. Progressive strategies for Monte-Carlo tree search. New Mathematics and Natural Computation, 4(03):343-357, 2008.

    [3] eCobertura by jmhofer. http://ecobertura.johoop.de/, Accessed: 24 May 2013.

    [4] Gunnar Farnebäck. Specification of the Go Text Protocol. Version 2, Draft 2, October 2002.

    [5] FEST: Fixtures for Easy Software Testing. https://code.google.com/p/fest/, Accessed: 24 May 2013.

    [6] Java Code Coverage for Eclipse. http://www.eclemma.org/, Accessed: 24 May 2013.

    [7] Domination (Risk Board Game). http://sourceforge.net/p/domination/wiki/Home/, Accessed: 27 Mar 2013.

    [8] CodePro Analytix User Guide. https://developers.google.com/java-dev-tools/codepro/do... Accessed: 24 May 2013.

    [9] Apache Maven Project. http://maven.apache.org/index.html, Accessed: 24 May 2013.

    [10] Risk AI Project. https://sites.google.com/a/ml.sun.ac.za/risk-ai/, Accessed: 21 May 2013.