
Using Bees Algorithm to Control Difficulty in Video Game AI

DAANG, JESSA, ATENEO DE DAVAO UNIVERSITY

JOVER, GREGG MEYRICK, ATENEO DE DAVAO UNIVERSITY

Write your abstract here.

General Terms: Clinical Practice Guidelines, Constraint Logic Programming, Case-Based Reasoning

Additional Key Words and Phrases: Comorbidity, Clinical Decision Support System

1. INTRODUCTION

1.1 Background of the Study

Bees Algorithm (BA) is an algorithm that simulates how bees forage for resources in a given area. In its basic form, scout bees that find a food source immediately return to the hive and share this information with other bees by performing the "waggle dance". This dance conveys three important pieces of information about the food source: the direction in which it can be found, its distance from the hive, and its quality rating (fitness). After performing the dance, the scout bees go back to the food source along with a select number of bees that were waiting inside the hive. If a food source is more promising, more follower bees are sent to it so that resources are harvested effectively and efficiently [1].

BA has been used to solve many different types of problems. Alzaqebah et al. [2] used BA and a modified version called probability BA (PBA) to solve examination timetabling problems. These problems involve allocating a number of examinations to pre-defined timeslots such that hard constraints are never violated and soft constraints are minimized as much as possible. The algorithm determined how to resolve conflicts in exam schedules while finding optimal schedules for students. BA has also been applied to single-machine job scheduling problems, in which a number of jobs on one machine have to run without interruption. Factors such as tardiness, earliness, lateness and flowtime have to be considered in running the jobs. In addition, an optimal schedule should have no idle time between consecutive jobs, should have a V-shape property (jobs before or after the job with the shortest processing time are arranged in non-increasing or non-decreasing order of processing times), and should satisfy the condition that the processing time of the first job starts at time zero or one job finishes exactly at the due date. Pham et al. [3] applied a version of BA to this problem in which the foraging process of the bees was adapted to finding an appropriate idle time between the first and subsequent jobs. The results obtained from the experiment showed that BA performed more strongly than other existing techniques. The idle time adjustment at each iteration allowed for more accurate results.

There is no existing research on the use of BA in video game Artificial Intelligence (AI). The waggle dance of scout bees that have found flower patches is essentially a way for bees to communicate with the rest of the hive. This style of communication can be incorporated into a game AI, making agents work together in swarms. The bees locate the player by having scout bees patrol random areas of a grid overlaid on the game environment. Reinforcements can be called by having at least one scout bee return to the hive and communicate with the other bees in order to defeat the player. By controlling the number of bees and flower patches, one can control the total number of bees that the player must kill or avoid in order to survive. Therefore, by controlling the effectiveness of the communication among bees, one can control the difficulty of the game.
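As a sketch of this mechanism, the following Python fragment drives scout bees over a grid and recruits followers once a scout sights the player. It is a minimal illustration of the proposed design, not a finished implementation: all names and values (GRID_SIZE, NUM_SCOUTS, recruit_fraction, and so on) are hypothetical, and sighting is simplified to an exact cell match.

import random

GRID_SIZE = 20       # game map divided into GRID_SIZE x GRID_SIZE cells (assumed)
NUM_SCOUTS = 5       # scout bees patrolling random cells
NUM_FOLLOWERS = 20   # follower bees waiting in the hive

def random_cell():
    return (random.randrange(GRID_SIZE), random.randrange(GRID_SIZE))

def patrol_and_recruit(player_cell, recruit_fraction):
    # One patrol cycle: scouts search random cells; if any scout sights the
    # player, a fraction of the hive is dispatched as reinforcements.
    # recruit_fraction (0..1) stands in for "communication effectiveness"
    # and is the knob that would control difficulty.
    scout_cells = [random_cell() for _ in range(NUM_SCOUTS)]
    if any(cell == player_cell for cell in scout_cells):
        return int(NUM_FOLLOWERS * recruit_fraction)   # bees sent toward the player
    return 0

# Easy game: few followers respond; hard game: most of the hive responds.
print(patrol_and_recruit((3, 4), recruit_fraction=0.25))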


1.2 Problem Statement

The main problem of the study is how to implement BA in the AI of a 2-D game.

The specific problems of the study are as follows:

(1) How does one control the number of bees and flower patches in the game to make the playing experience more challenging?
(2) How does one update the position of the player while the algorithm is running?
(3) How will BA generate possible

1.3 Objectives

The study seeks to conduct a simulation of how BA performs in a simple 2-D game and to learn more about how game developers implement AI for non-playable characters.

The specific objectives of the study are as follows:

(1) Implement a simulation of the bees and the player through a grid.
(2) Control the difficulty of the game relative to how the player performs in the simulation.

1.4 Significance of the Study

The study is worth exploring because it would help game developers create better AI that challenges the player. Since there is no existing research on video game Artificial Intelligence using BA, the study would also serve as an opportunity for further exploration by other researchers.

1.5 Scope and Limitations

The study is limited to a two-dimensional (2-D) environment for the simple purpose of conducting a simulation of the game using BA. Since it is a 2-D game, simple graphics will be used.

2. REVIEW OF RELATED LITERATURE

2.1 Overview of Video Gaming

Player satisfaction is influenced by different factors such as the Graphical User Interface (GUI), background story, game difficulty and others. The game should be able to provide a challenge to the player without being excessively difficult or easy. There are two scenarios in which players are matched to a challenge appropriate to their skill level: player-versus-player (PVP) and player-versus-environment (PVE). In PVP scenarios, difficulty may vary because the player battles against human opponents, who may have better skills, equipment and game experience than the player facing them, which is why the game attempts to find opponents whose skill level is comparable to the player's. The popular MOBA game Dota 2 uses a metric known as Matchmaking Rating (MMR) to match up players with supposedly similar skill levels and game experience. League of Legends (LOL), another MOBA game, uses Leagues to determine which skill group a player belongs to. This feature is implemented to find games faster while considering the relative skill level of the other players in a match.

However, MMR in Dota 2 is only a numeric metric. Winning a game adds MMR points to the player's profile, while losing a game subtracts them. This means that a player's skill is not necessarily reflected in his/her current MMR rating. A player may be skilled enough and try to improve his/her MMR rating, but because of other debilitating factors (e.g., unskilled teammates, network delay, very strong opponents), the player loses games, thus decreasing his/her MMR rating. Another disadvantage of MMR is that very skilled players may deliberately decrease their rating or create a new account. This allows them to play against relatively new, unskilled players, causing inconvenience to the opposing team. This is commonly referred to as "smurfing" and is generally frowned upon by the Dota 2 community because of the skill imbalance. Since LOL has no numeric representation of player skill and Leagues are skill groups that determine "where players are", some players find themselves teamed up with less skilled members, whom the community describes as "toxic" because their lack of experience causes games to become difficult, ruining the experience of the other team members. Both metrics can be manipulated by players to find an enjoyable gaming experience or to ruin the gaming experience of the opposing team. Thus, MMR and Leagues are not exactly accurate representations of player skill, but they are close approximations of how a player performs in-game.
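The additive rating update described above can be pictured in a few lines; the fixed point value below is hypothetical, since the exact formula is not public.

def update_mmr(mmr, won, delta=25):
    # Winning adds points, losing subtracts them (assumed fixed delta).
    return mmr + delta if won else mmr - delta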

In PVE scenarios, the player battles against AI entities or the game environment itself. The player can choose among pre-defined, static difficulties (e.g., Easy, Medium, Hard) before playing the game. The problem with such a paradigm is that novice and expert players alike can become frustrated or bored with the pre-defined difficulty levels. The discrete difficulty levels do not consider the skill level of individual players or the time it takes for a player to learn the game [4]. In addition, some difficulty levels may be extremely hard to complete even when the player has the best gear, levels and other advantages. The game Resident Evil 5 includes a Professional difficulty in which a single hit from any enemy immediately kills the player, who also cannot be revived by a teammate. Some difficulty levels cause a permanent change in the gaming experience of the player. For example, the Role-Playing Game (RPG) Diablo III features a Hardcore difficulty that does not necessarily make the game harder, but carries the penalty that a character who dies is permanently deleted from the game. Because of this, players are more careful when playing at Hardcore difficulty in order to avoid the inconvenience of starting a new character all over again.

Game developers have devised two additional kinds of adjustments for managing game difficulty. The first involves controlling the amount of resources that a player may acquire. By limiting the number of items, the player has to play conservatively in order not to exhaust his/her resources. The second involves adjusting the quality of AI-controlled enemies. Most current games use AI-like strategies to dynamically adjust the game difficulty based on the skill of the player. The difference between dynamic difficulty control and existing methods is that the environment or the player's adversaries (or both) are in a constant state of flux, so that when the player becomes better the game becomes harder, and when the player is struggling the game adjusts to make the playthrough easier.

A computer-generated opponent is known as an agent or an Artificial Intelligence (AI). Agents appear most commonly in single-player and PVE games. AI opponents also appear in multiplayer games, though their role is limited to being mere distractions or helpful allies. Dota 2 uses Creeps to win areas, earn gold and experience, and overwhelm their counterparts. Players cannot destroy defensive structures without their presence, which is why Creeps are important units in the game. Modern games have complex AIs that include methods for carrying out actions and a controller for determining which actions to perform. These controllers have been implemented as finite-state machines, hierarchical decision trees, and goal-oriented action planning, which uses a variation of the A* shortest-path algorithm to find appropriate actions [5].

BA is a population-based method that could find use in game controllers that adjust to player skill level. BA is a type of swarm-based optimization algorithm that finds locally optimal solutions in a given area. Scout bees do a random search of the area, trying to find patches. When they reach these patches, the scout bees return to the hive and communicate with other bees by doing a "waggle dance" [1].

The dance is an important method of communication with the other bees in the hive. The desirability of a given patch is conveyed by the direction of the patch, the distance of the patch from the hive (indicated by the duration of the dance) and the quality rating of the patch (indicated by the frequency of the dance). This method of communication can be used to pass information among agents in a game environment. By controlling the three parameters mentioned, the effectiveness of communication between agents can be controlled. Since improving the dance parameters would make the agents harder to defeat, managing the effectiveness of communication can be used to manage game difficulty.

2.2 Basics of Game Programming

In First-Person Shooter (FPS) games, the player sees the world through the eyes of the main character. The player fights with a variety of weapons, most notably firearms, to defeat other players or AI enemies. AI for this genre is realized through two requirements. The first requirement is to implement AI actions, such as moving, patrolling, avoiding obstacles, targeting enemies, pursuit and others [6]. The second requirement is to implement the decision-making process of the AI, so that the AI can choose which actions to perform.

Action, the simpler of the two requirements, is typically implemented as two components. The first component deals with the weapon controller, which models shooting, including shot determination such as damage (based on range and accuracy) and changes in ammunition. The other component deals with motion, specifically searching for and pursuing enemies.

Searching is commonly handled using an A* algorithm, which Nareyek describes as an "improved version of Dijkstra's shortest-path algorithm" [5]. A* functions by overlaying a graph on the game map. The vertices of the graph are pre-determined waypoints defined by the programmer. The programmer also defines paths between these waypoints, as well as their distances; these paths are the edges of the graph. When an agent moves, it finds the nearest waypoint from its current position. The A* algorithm then finds the shortest path between the starting waypoint and the destination waypoint. That path is passed to the mover, and the mover takes the agent to the destination. Once an enemy has been sighted, pursuit is conducted by using the viewing field to maintain vision.
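The following Python sketch illustrates A* over such a waypoint graph. The graph and coordinate structures are assumed to be authored by the programmer, as described above; this is a generic illustration rather than code from any particular engine.

import heapq
import math

def a_star(graph, coords, start, goal):
    # graph maps waypoint -> {neighbor: distance}; coords maps
    # waypoint -> (x, y) and feeds the straight-line heuristic.
    def h(n):
        (x1, y1), (x2, y2) = coords[n], coords[goal]
        return math.hypot(x2 - x1, y2 - y1)

    frontier = [(h(start), 0.0, start, [start])]   # (f, g, node, path)
    visited = set()
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path                            # shortest waypoint sequence
        if node in visited:
            continue
        visited.add(node)
        for nbr, dist in graph[node].items():
            if nbr not in visited:
                heapq.heappush(frontier,
                               (g + dist + h(nbr), g + dist, nbr, path + [nbr]))
    return None                                    # goal unreachable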

Searching and pursuing targets can be hindered by obstacles, so the agent cannot simply move in a straight line to reach the target or destination. To manage this, the agent could simply move along the edge of the obstacle until it can again move straight towards its destination. A better alternative is to use avoidance vectors, which the algorithm adds to the current movement vector for each obstacle, as shown in Figure 1. The resulting path curves around the obstacle because the goal vector is constantly tracking the goal.


Figure 1. Avoidance Vectors

Obstacle avoidance algorithms act as an intermediary between the nodes of an A* graph. A* gives the mover a set of goals to which the agent can move, and avoidance vectors are applied when obstacles are in the vicinity.
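A minimal sketch of this combination follows, assuming a 2-D world; the avoidance radius and weight are illustrative tuning parameters, not values from the cited source.

import math

def steering_vector(agent_pos, goal_pos, obstacles, avoid_radius=3.0, avoid_weight=2.0):
    # Unit vector toward the goal, plus a push-away vector for each obstacle
    # that lies within avoid_radius of the agent.
    gx, gy = goal_pos[0] - agent_pos[0], goal_pos[1] - agent_pos[1]
    glen = math.hypot(gx, gy) or 1.0
    vx, vy = gx / glen, gy / glen
    for ox, oy in obstacles:
        dx, dy = agent_pos[0] - ox, agent_pos[1] - oy
        dist = math.hypot(dx, dy)
        if 0 < dist < avoid_radius:
            strength = avoid_weight * (avoid_radius - dist) / avoid_radius
            vx += strength * dx / dist
            vy += strength * dy / dist
    return vx, vy   # follow this vector each tick; the path curves around obstacles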

Decision making, the other major component in developing AI, is typically implemented in several ways. The most basic implementation of decision making uses Finite State Machines (FSM). An FSM is a simple and powerful conceptual graph that characterizes the states the agent can be in, describes which states the agent can transition to, and can be implemented with if-then statements [6]. One example of a very simple FSM has two states and three transition conditions, as seen in Figure 2. The problem with FSMs, however, is that they become unmanageably complex for large and complex AIs.

Figure 2. A Simple Finite State Machine (FSM)
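An FSM of this size reduces to a handful of if-then statements, as the following sketch shows; the state names and transition conditions are hypothetical stand-ins for those in Figure 2.

def next_state(state, sees_player, player_in_range):
    if state == "patrol" and sees_player:
        return "attack"      # transition 1: enemy sighted
    if state == "attack" and not sees_player:
        return "patrol"      # transition 2: enemy lost
    if state == "attack" and not player_in_range:
        return "patrol"      # transition 3: enemy out of range
    return state             # no transition condition met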

Another simple way of handling decision making is through the use of decision trees. Decision trees are branching structures that allow the AI to make high-level strategic decisions [6]. To reach a decision, the AI traverses the tree. The first level of the decision tree is very general; for example, the AI decides whether it should attack or defend. Based on that decision, the AI looks at the appropriate sub-tree. As more decisions are made, the AI eventually descends to a leaf node that contains the final decision. As with FSMs, decision trees are conceptual tools and can be implemented with if-then statements.
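A two-level decision tree can likewise be written as nested if-then statements. In this sketch the root chooses between attacking and defending and each sub-tree refines the choice; all conditions and actions are hypothetical.

def decide(agent):
    if agent["health"] > 50:                 # root: attack or defend?
        if agent["has_ranged_weapon"]:
            return "attack_from_distance"    # leaf node: final decision
        return "attack_melee"
    if agent["cover_nearby"]:
        return "hide_behind_cover"
    return "retreat"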


Actual game AIs combine simple and advanced mechanisms to create believable and realistic agents. Game AIs are said to be effective if agents exhibit common sense [7]. This means that AIs must know the right actions as well as how to perform them and when to act. To establish common sense, the AI needs to exhibit coherency, transparency, runnability and understandability. Coherency is ensuring that behavior transitions are realistic and avoid dithering, i.e., quick switching among states. Transparency is giving an agent an appearance that matches its current action; for example, if the agent is firing its weapon at the player, the player should be able to see that the agent is pointing the weapon at the player and is in a shooting stance. Runnability is ensuring that the AI code can complete execution in the processor time allotted for AI decision making. Understandability is making sure that the system is simple enough for the developer to understand. The AI should also allow for character-to-character variety. Lastly, the AI needs variability so that it behaves differently in the scenes created by the designers in the service of the story.

2.3 AI Case Studies

Halo 2, an FPS game, has a complex AI that uses multiple strategies to achieve realism [7]. The AI system in the game uses a hierarchical FSM (HFSM) combined with a decision-tree-based structure called a behavior diagram, as shown in Figure 3. An HFSM is an FSM whose levels are prioritized lists of states. The AI in Halo 2 processes states according to their priority. Proper ordering of these states ensures that an AI avoids nonsensical behaviors, such as trying to enter a vehicle when it is already in one.

Figure 3. A sample behavior diagram for the Halo 2 AI


The problem with HFSMs is the occasional need to dynamically raise the priority of a lower-priority action. To allow a lower-priority state to override a higher-priority state, Halo 2 uses behavior impulses, which are pointers that reference states in the behavior diagram. One example that needs a temporary priority adjustment is directing the AI to enter a vehicle when the player gets in first. Normally, fighting enemies would have a higher priority than finding a vehicle, but the impulse "player_in_vehicle" should be ranked higher than the fighting behavior. The "player_in_vehicle" impulse simply references the "enter_vehicle" code rather than duplicating it. Impulses can be positioned at any level in the behavior diagram.
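The prioritized list, including an impulse ranked above the fighting behavior, can be sketched as follows; the behavior names and relevance tests are hypothetical, not taken from the Halo 2 code.

def pick_behavior(agent, behaviors):
    # behaviors is ordered from highest to lowest priority; the first
    # relevant entry wins. An impulse is modeled as an extra entry
    # spliced in at a higher position at run time.
    for name, is_relevant in behaviors:
        if is_relevant(agent):
            return name
    return "idle"

behaviors = [
    ("self_preservation", lambda a: a["health"] < 20),
    ("player_in_vehicle", lambda a: a["player_in_vehicle"] and not a["in_vehicle"]),  # impulse referencing enter_vehicle
    ("fight",             lambda a: a["enemy_visible"]),
    ("patrol",            lambda a: True),
]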

Behavior impulses can also run arbitrary pieces of code that serve other functions such as logging calls, debugging information or player action cues when certain conditions are met.

When a behavior diagram grows large, determining which behaviors are relevant at a given time takes considerable time. To reduce the time needed to assess behavior relevancy, behaviors are tagged with the states in which they are relevant and are temporarily removed from the behavior diagram when they become irrelevant. For instance, when the agent is a gunner in a vehicle, the "throw grenade" behavior can be removed, and the retreat behavior can be removed if the agent is not the driver.

Another strategy in order to reduce the number of behaviors to check is by using stimulus behaviors, which are behaviors dynamically added and removed from behavior diagrams by an event handler. One example of a stimulus behavior would be a "flee_because_leader_died" stimulus, which can be dynamically added in the behavior diagram by an actor death event handler, and removed after a period of time or when a new leader arrives.

Although different AIs in Halo 2 have different behavior properties, the agents are similar enough to warrant the use of character hierarchies and inheritance to simplify implementation. For example, in Halo 2, grunt majors do more damage and take more damage than regular grunts but exhibit identical behavior. A grunt major inherits all the characteristics of a grunt but modifies the damage and vitality statistics.

Another advanced AI system is the Goal Oriented Action Planning (GOAP) system developed by Orkin [8] in an FPS game called F.E.A.R. In place of an elaborate FSM, GOAP searches for actions that meet a goal. This allows a Non-Playable Character (NPC) to handle unexpected situations.

Each agent is divided into sensors, working memory, a real-time planner, a blackboard, and subsystems that manage actions like movement and aiming. Sensors gather information about the environment of the agent. Some sensors are event driven (e.g. recognizing damage) while others poll (e.g. finding tactical positions in the environment). The sensors store information gathered in the working memory of the agent. The real-time planner watches for significant changes to working memory and responds by reevaluating the goals of the agent and strategies for accomplishing those goals. If the goals are altered, the planner adjusts the relevant variables on the blackboard. Finally, the subsystems check the blackboard for changes at a set time interval and make any appropriate changes to their behavior.

There are three advantages of controlling agents using multiple components instead of a single FSM. First, this decouples goals from actions, making it easier to associate different strategies for achieving common goals with different units. The alternative, associating multiple strategies for achieving a common goal with a single FSM, produces an extremely complex FSM. Second, this makes it easier to define behavior incrementally, including defining which behaviors are prerequisites for other behaviors and adding new actions late in the development cycle. Allowing the real-time planner to determine the appropriate transitions at run time eliminates the need to work new actions into an FSM. Third, it allows better support for dynamic problem solving. GOAP makes it straightforward to create agents that work through a list of prioritized strategies until they try one that succeeds. This allows for very realistic AI behavior. Orkin [8] gives this example, directly quoted from the reference paper:


"Imagine a scenario where we have a patrolling AI who walks through a door, sees the player, and starts firing. If we run this scenario again, but this time the player physically holds the door shut with his body, we will see the AI try to open the door and fail. He then re-plans and decides to kick the door. When this fails, he re-plans again and decides to drive through the window and ends up close enough to use a melee attack!"

In this example, the agent tries to open the door but fails, as does the second option of kicking the door. The agent keeps trying different methods until it finds one that works.

In F.E.A.R., agents interact with their environment through the use of smart objects. A smart object is anything the agent could use to accomplish a goal. For example, if the goal of the agent is to get to a point on the other side of a closed door (a type of smart object), then the agent interacts with the door to open it. The agent is equipped with sensors that detect nearby smart objects, which are then stored in its working memory. Since some actions may only be available when certain smart objects are present, an agent must reevaluate its goals when new objects are placed into working memory. For example, a weaponless agent chasing the player should pick up an assault rifle when it sees one and continue chasing the player. An agent could also use a smart object for defense by flipping a table over and hiding behind it. The benefit of smart objects is that the programmer does not have to script agents into performing actions (e.g., kicking the table). Instead, the agent chooses the action to accomplish its goal.

Action planning is done using the A* algorithm. As mentioned previously, the A* algorithm, traditionally used for navigating a playing field, provides the agent with the shortest path between starting and ending positions. The algorithm has been adapted to find the best way to accomplish a goal. To do this, each action is associated with a cost, with higher costs denoting less desirable actions. If a goal is treated as a destination in a graph, possible actions as edges, and resulting world states as intermediary nodes, then A* can find the most efficient path (i.e., sequence of actions) to reach the goal. If that path fails to accomplish the goal, the edge for the inappropriate action can be removed from the search and the next best path can be found. Paths that A* finds may also need to be disqualified because some paths are unavailable at certain times: certain actions may only be relevant if the agent satisfies a given condition, such as being in a squad or carrying a firearm.
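A minimal sketch of this kind of planning follows. It uses uniform-cost search (A* with a zero heuristic) over hashable world states, with actions given as (name, cost, precondition, effect) tuples; the door example mirrors the re-planning anecdote above, but every definition here is hypothetical.

import heapq

def plan(start_state, goal_test, actions):
    frontier = [(0, 0, start_state, [])]   # (cost, tie-breaker, state, path)
    seen, counter = set(), 1
    while frontier:
        cost, _, state, path = heapq.heappop(frontier)
        if goal_test(state):
            return path                    # cheapest action sequence
        if state in seen:
            continue
        seen.add(state)
        for name, c, pre, eff in actions:
            if pre(state):
                heapq.heappush(frontier, (cost + c, counter, eff(state), path + [name]))
                counter += 1
    return None

# If "door_blocked" is in the state, the planner falls back to the costlier
# window route, mirroring the F.E.A.R. anecdote.
actions = [
    ("open_door",           1, lambda s: "door_blocked" not in s, lambda s: s | {"inside"}),
    ("dive_through_window", 5, lambda s: True,                    lambda s: s | {"inside"}),
]
print(plan(frozenset(), lambda s: "inside" in s, actions))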

GOAP also produces "ghost" behaviors, unintended behaviors that emerge in practice. One example was NPC agents looking at distant grenades, due to how NPCs use noise to find the player. Another ghost behavior was NPCs finding points of cover to the side of the player, which gave the appearance of NPCs working together to flank (or ambush) the player. These behaviors emerge in a GOAP system due to unanticipated state transitions. While an FSM requires a programmer to tell the agent how to act in the face of danger, GOAP defines the grenade as a danger, and the disturbance causes the agent to determine an appropriate (or, in the case of ghost behaviors, inappropriate) reaction at run time.

Another AI technique used in F.E.A.R. is "fake" AI, or the use of audio and visual cues to suggest agent activity. For example, the game may generate a cue such as "I'm moving. Cover me!" when a squad of soldiers is advancing. Similarly, the game may generate a cue like "Look out!" when a grenade falls near an NPC in a squad after which the NPC tries to escape from the blast. In these instances, the agents are reacting to their environments rather than the cues; they only appear to follow the cues because the cues are in sync with their goals. Another example of "fake" AI is a call by the last member of the squad for reinforcements. While the call has no effect on calling troops, the player will encounter more troops later in the level, making it appear as though the AI responded to the reinforcement request.


2.4 Game Balancing

Controlling game difficulty, also called game balancing, is an important gameplay issue. A player cannot enjoy a game when it is too easy or too hard. Static difficulty levels (e.g., Easy, Medium, Hard), which game developers normally use, can still make the game too easy or too hard. Some game balancing systems give players or NPCs an unfair advantage. A common example of this is the "rubber band" effect found in many racing games, where the last-place car is rocketed forward to near the front [9]. Other games use a ramping technique where the game gets progressively harder as it goes on. However, increasing the difficulty faster than the learning curve of the player can frustrate the player. These considerations have led to the study of Dynamic Difficulty Adjustment (DDA).

Difficulty adjustment can take many forms. A game can switch between different policies for how to challenge players. For example, a game could switch from a comfort policy that attempts to "keep players feeling challenged, but safe [by] padding their inventory" [9] to a discomfort policy that challenges players by limiting item drops. Another way to adjust difficulty is through more direct intervention. Items such as weapons or health packs can be added to the game environment to assist the player. Player statistics such as health and damage can be modified. In the same vein, enemy statistics can be modified to pose a bigger threat to the player. A combination of these methods is normally used to adjust difficulty.

Hunicke [9] has developed a DDA system that regulates game mechanics, dynamics and aesthetics. Mechanics refers to the trial-and-error process of continuous exploration, fighting and death that is common in FPS games. Dynamics is the gradual increase in difficulty as the player progresses through the game; at the same time, resources may become increasingly scarce, so that the player has to work harder for them. Finally, aesthetics is how the mechanics and dynamics create difficulty. Increasing the game difficulty as the player progresses is a strategy for maintaining game aesthetics.

Hunicke integrated her system, the "Hamlet System" [9], into the Half-Life game engine owned by Valve. The Hamlet System is divided into two parts: evaluation of player performance and adjusting game settings. The system evaluates players based on the rate at which they lose health. The rate at which a player loses health fits a Gaussian probability distribution shown in Formula 1.

p(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\mu)^2 / 2\sigma^2}

Formula 1. Gaussian Probability Distribution for Player Health Loss

"During combat, Hamlet records the damage...each enemy does to the player" [9]. From this data, the Hamlet System can determine the probability the player will die in that encounter. If the probability of the player dying rises above 40%, the Hamlet System intervenes. It increases the health of the player by 15 points every 100 ticks.

Hunicke experimented to see how adjustments affected player performance, whether players noticed adjustments, and whether adjustments affected "the player's enjoyment, frustration, or perception of game difficulty" [9]. Players playing the unadjusted and adjusted games died an average of 6.4 times and 4.0 times in the first 15 minutes, respectively. If repeated death is equated with frustration, then these adjustments should reduce player frustration. In a short survey given after gameplay, expert players rated the adjusted game more enjoyable, while novice players rated the two games equally. The survey also found no correlation between player perception of game difficulty and whether the game difficulty was adjusted. This means that the game was made less frustrating and more enjoyable without the player feeling as though the game was "fixed" in his/her favor. Hunicke concludes that if a small change like manipulating health can improve a game, then a well-designed DDA system has the potential to greatly improve a game.

Another method for handling DDA is proposed by Andrade et al. [10][11], who integrated their DDA system into Knock'Em, a fighting game similar to Midway's Mortal Kombat. Like the Hamlet System, the Reinforcement Learning (RL) system proposed by Andrade et al. uses a difficulty calculation to manipulate the game. However, fighting games differ from first-person shooters like Half-Life: no weapons or health packs are provided in typical fighting games. The authors rejected two other strategies for DDA: dynamic scripting can become too complex in large systems, while genetic algorithm techniques do not adapt quickly to player skill.

Reinforcement Learning (RL) is "characterized as the problem of learning what to do (how to map situations into actions) so as to maximize a numeric reward signal" [10]. RL is based on a Markov Decision Process (MDP) involving a series of reward values r(s, a), where an entity receives a reward for taking an action a in a state s. The RL algorithm attempts to maximize the reward of an entity by choosing the correct action based on its current state. The algorithm uses memories of past choices and the results of those choices to choose the action that maximizes reward.
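A standard realization of this idea is the Q-learning update sketched below; alpha and gamma are generic learning-rate and discount parameters, and this is an illustration of RL in general rather than the exact update used by Andrade et al.

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    # Q is a nested dict: state -> action -> estimated value.
    best_next = max(Q.get(s_next, {}).values(), default=0.0)
    Q.setdefault(s, {}).setdefault(a, 0.0)
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])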

The authors discuss two main difficulties with RL. The first is getting the AI to match the skill of the player at the start of the game. To do this, Andrade et al. pre-trained AIs by having them play against themselves to develop basic character policies. Once the agents start playing against the player, they refine their play style to complement the skill and style of the player. The second difficulty is choosing what to do once the optimal policy has been learned. Directing agents to choose actions randomly could result in nonsensical actions (e.g., punching when the opponent is on the other side of the screen), while directing agents to choose only optimal actions would make the agent impossibly difficult. Instead, the AI must choose "progressively sub-optimal actions until the performance of the agent is as good as the performance of the player" [10], or more optimal actions should the game be too easy.
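Choosing progressively sub-optimal actions can be sketched as picking the rank-th best action for the current state, where the game raises the rank when the player struggles and lowers it when the player dominates; this illustrates the idea, not the authors' code.

def choose_action(Q, state, rank):
    # rank 0 picks the optimal action; higher ranks weaken the agent.
    ranked = sorted(Q[state].items(), key=lambda kv: kv[1], reverse=True)
    rank = min(rank, len(ranked) - 1)
    return ranked[rank][0]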

Andrade et al. tested their RL agents against agents that choose random actions and agents that always choose the optimal action. They found that fights normally ended with a small difference in health points, "meaning that both fighters had similar performance" [10]. This means that RL agents closely match the skill level of their opponents.

2.5 Overview of Bees Algorithm

Bees Algorithm (BA) is an algorithm that simulates how bees forage for resources in a given area. The algorithm functions by sending scout bees to random flower patches in a given search space. After searching for possible flower patches, the scout bees return to the hive, where they perform a "waggle dance" to pass information to the other bees. As mentioned, the dance provides three kinds of information: the direction of the flower patch, the distance of the flower patch from the hive, and its quality rating (fitness function).
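For reference, the following is a minimal sketch of the basic BA loop for a one-dimensional search space; the parameter roles follow the usual description of the algorithm [1], and all names and values are illustrative.

import random

def bees_algorithm(fitness, bounds, n_scouts=20, n_best=5, n_recruits=10,
                   patch_size=0.1, iterations=100):
    # Scouts search at random; recruited bees search the neighborhoods of the
    # best patches (local search); the remaining bees keep scouting (global search).
    lo, hi = bounds
    scouts = [random.uniform(lo, hi) for _ in range(n_scouts)]
    for _ in range(iterations):
        scouts.sort(key=fitness, reverse=True)
        new_population = []
        for patch in scouts[:n_best]:
            recruits = [min(hi, max(lo, patch + random.uniform(-patch_size, patch_size)))
                        for _ in range(n_recruits)]
            new_population.append(max(recruits + [patch], key=fitness))
        new_population += [random.uniform(lo, hi) for _ in range(n_scouts - n_best)]
        scouts = new_population
    return max(scouts, key=fitness)

# Example: find the maximum of a simple function on [0, 10].
print(bees_algorithm(lambda x: -(x - 3.0) ** 2, (0.0, 10.0)))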

2.6 Section 2 (Replace the heading appropriately.)

Body of Section 2 here.


SUBSECTION 1 (As appropriate only)

Body of subsection 1 here.

Tables should appear as follows.

Table I. Caption of Table I

If there are numbered listings, this is how the numbered listings should appear.

(1) Item 1
(2) Item 2
(3) Item 3

If there are bulleted listings, this is how the bulleted listings should appear.

• Item 1
• Item 2
• Item 3

Theorems should appear as follows.

THEOREM 1.1. Description of theorem here.

Formulas should be inserted using an equation editor.

f(x) = a_0 + \sum_{n=1}^{\infty} \left( a_n \cos \frac{n\pi x}{L} + b_n \sin \frac{n\pi x}{L} \right)

Figures should be captioned as follows.

Fig. 1. Caption of figure here.

Pseudocode, prosecode or literate code of algorithms should be presented as follows.


ALGORITHM 1: Iterative Algorithm
current_position ← center
current_direction ← up
while current_position is inside circle, do
    neighborhood ← all grid hexes within two hexes from current_position
    for each hex in neighborhood, do
        for each neuron in hex, do
            convert neuron_orientation to vector
            scale vector by neuron_excitation
            vector_sum ← vector_sum + vector
        end
    end
    normalize vector_sum
    current_position ← current_position + vector_sum
    current_direction ← vector_sum
end
return current_position

Description of the algorithm here.

2.7 Section 2 (Replace the heading appropriately.)

Body of Section 2 here.

3. METHODOLOGY

3.1 Section 1 (Replace the heading appropriately.)

Body of Section 1 here.

3.2 Section 2 (Replace the heading appropriately.)

Body of Section 2 here.

4. THEORETICAL BACKGROUND

4.1 Section 1 (Replace the heading appropriately.)

Body of Section 1 here.

4.2 Section 2 (Replace the heading appropriately.)

Body of Section 2 here.


REFERENCES

Authors. Book Title. Publisher, City of Publication, Year of Publication.
Authors. Book Article Title. in Editors, Title of edited book, Publisher, City of Publication, Year of Publication, Pages.
Authors. Journal Article Title. Journal or magazine name, Volume (Issue), Pages.
Authors. Conference Proceedings Title. in Title of conference, (Location of Conference, Year), Publisher, Pages.