Player Modeling
Player Modeling
Chris Boston, Mike Pollock, Kipp Hickman
What is Player Modeling?
• Maintains an updated profile of the player
◦ Contains the skills, weaknesses, preferences, etc. of each player
• Adapts over time to a particular player's habits
• Most games are "manually adaptive"
◦ Selecting a basic difficulty level
◦ No machine learning
Designing a Player Model
◦ A list of numeric values, one for each trait
◦ Usually an array of values between 0 and 1 for each trait, initialized to 0.5
◦ Each value represents a single aspect of behavior (strategy, maneuvers, skills)
◦ Choose traits that correlate well with AI actions
◦ Balance between fine statistics that cost a lot to compute and coarse statistics that don't capture enough information
◦ An FSM can also be used
• Decouples the player model from game code, allowing easier tuning
Updating the Player Model
• Determine when to update the model
◦ Some detection routines can be labor-intensive
◦ Fix: keep a log of player actions for later background computation, or reuse info already being computed by the AI code
• Update the model
◦ Least Mean Squares algorithm:
  traitValue = k*observedValue + (1-k)*traitValue
  where k (the learning rate) is usually 0.1-0.3
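The trait array and update rule above can be sketched as a small class. This is a minimal illustration, not code from the slides; the trait names are invented.

```python
# Minimal sketch of a player model updated with the LMS rule.
# Trait names ("aggression", "stealth") are illustrative only.

class PlayerModel:
    def __init__(self, traits, k=0.2):
        # Every trait starts at the neutral value 0.5.
        self.traits = {name: 0.5 for name in traits}
        self.k = k  # learning rate, typically 0.1-0.3

    def update(self, trait, observed):
        # traitValue = k*observedValue + (1-k)*traitValue
        old = self.traits[trait]
        self.traits[trait] = self.k * observed + (1 - self.k) * old

model = PlayerModel(["aggression", "stealth"], k=0.2)
model.update("aggression", 1.0)    # observed one aggressive action
print(model.traits["aggression"])  # 0.2*1.0 + 0.8*0.5 = 0.6
```

Because k is small, a single observation only nudges the trait; repeated observations of the same behavior pull it steadily toward 0 or 1.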
Ways of Using the Player Model
• Predictive – determine how the player will most likely respond to each AI action
• Exploitation – find weaknesses in the player's tactics
◦ Can be used for exploitation and/or tutorials
• Heuristics can measure the current difficulty level faced by the player, then adjust appropriately: if the game is too hard, choose progressively easier actions until the desired difficulty is reached
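The "choose progressively easier actions" heuristic might look like the following sketch. The action names, their difficulty scores, and the target difficulty are all invented for illustration.

```python
# Hypothetical difficulty-adjustment heuristic: if the measured difficulty
# facing the player is above the target, skip AI actions that are harder
# than the target and fall through to easier ones.

TARGET_DIFFICULTY = 0.5  # assumed tuning constant

def choose_action(actions, current_difficulty, target=TARGET_DIFFICULTY):
    """actions: list of (name, difficulty) pairs sorted hardest-first."""
    for name, difficulty in actions:
        # Player is struggling: pass over actions harder than the target.
        if current_difficulty > target and difficulty > target:
            continue
        return name
    return actions[-1][0]  # fall back to the easiest action

actions = [("flank", 0.9), ("charge", 0.6), ("patrol", 0.3)]
print(choose_action(actions, current_difficulty=0.8))  # "patrol"
print(choose_action(actions, current_difficulty=0.4))  # "flank"
```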
Hierarchical Player Model
• Abstract traits are calculated from their related child concrete traits
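A hierarchical model can be sketched as a mapping from abstract traits to their child traits. The slides do not specify how children are aggregated, so simple averaging is an assumption here, and all trait names are invented.

```python
# Sketch of a hierarchical player model: each abstract trait is computed
# from its concrete child traits (aggregation by averaging is assumed).

concrete = {"uses_grenades": 0.8, "charges_head_on": 0.6, "hides": 0.2}

hierarchy = {
    "aggression": ["uses_grenades", "charges_head_on"],
    "caution": ["hides"],
}

def abstract_traits(concrete, hierarchy):
    return {
        parent: sum(concrete[c] for c in children) / len(children)
        for parent, children in hierarchy.items()
    }

print(abstract_traits(concrete, hierarchy))
# aggression = (0.8 + 0.6) / 2 = 0.7, caution = 0.2
```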
Preference-based Player Modeling
• Determine the player's strategy
◦ A strategy is a preference for certain game states
• Two ways of modeling this:
◦ Preference functions
◦ Probabilistic model
Preference Functions
• Assign a preference score to game states: the inputs are numerical features that represent a game state, the output is a numeric preference score
• Example: V(s) = 5*[player's health in s] + 10*[magic points in s] – 7*[number of enemies in s]
• The weights in the preference function are adjusted with reinforcement learning:
◦ If the predicted state does not match what the player did, adjust the weights
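A sketch of the example preference function and one weight-adjustment step follows. The slides only say "adjust the weights," so the perceptron-style update used here (move weights toward the chosen state's features and away from the mispredicted one) is an assumption, as are the feature values.

```python
# Preference function V(s) with the example weights from the slide,
# plus a hypothetical weight-adjustment step when the prediction is wrong.

weights = [5.0, 10.0, -7.0]  # [health, magic points, number of enemies]

def V(weights, features):
    return sum(w * f for w, f in zip(weights, features))

def adjust(weights, chosen, predicted, lr=0.1):
    # Nudge weights toward the state the player actually chose
    # and away from the state we wrongly predicted.
    return [w + lr * (c - p) for w, c, p in zip(weights, chosen, predicted)]

chosen    = [0.9, 0.2, 0.1]  # features of the state the player picked
predicted = [0.5, 0.8, 0.3]  # features of the state we predicted

if V(weights, predicted) > V(weights, chosen):  # prediction was wrong
    weights = adjust(weights, chosen, predicted)
print(weights)  # weights shifted toward the player's actual preference
```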
Probabilistic Model
• The player is represented as a mixture of several predefined player types
◦ e.g. aggressive or defensive
• Each player type has an associated preference function
• Preference scores are computed based on the probability that the player will adopt one of the player types in that state
• The probabilities can be adjusted through learning
• Problems arise if the player doesn't match any of the models
Example
Aggressive type: V(s) = 4*[player's health in s] – 8*[number of enemies in s]
Defensive type: V(s) = 8*[player's health in s] – 8*[number of enemies in s]
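Using the two example functions above, a mixture score can be sketched as a probability-weighted sum. The type probabilities here are invented; in practice they would be adjusted through learning.

```python
# Probabilistic model sketch: the player's overall preference for a state
# is the type-probability-weighted sum of the per-type preference functions.

def v_aggressive(health, enemies):
    return 4 * health - 8 * enemies

def v_defensive(health, enemies):
    return 8 * health - 8 * enemies

# Current (illustrative) belief about which type the player is.
type_probs = {"aggressive": 0.3, "defensive": 0.7}

def mixture_score(health, enemies):
    funcs = {"aggressive": v_aggressive, "defensive": v_defensive}
    return sum(p * funcs[t](health, enemies) for t, p in type_probs.items())

print(mixture_score(health=1.0, enemies=0.5))
# 0.3*(4*1 - 8*0.5) + 0.7*(8*1 - 8*0.5) = 0.3*0 + 0.7*4 = 2.8
```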
Minimax
• Algorithm for finding the optimal move in a zero-sum game
• The player tries to maximize the value, while the computer tries to minimize it
• The player's turns are squares, the computer's are circles
• Actions are left or right
PM Search
• Basic idea: use preference functions for the values of each node, then apply Minimax
• V0 is the computer's preference function, V1 is the player's
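Plain Minimax over a tiny two-ply tree can be sketched as follows. The leaf values stand in for preference-function scores; in the PM-search variant the maximizing nodes would be scored with V1 and the minimizing nodes with V0.

```python
# Minimax sketch: leaves are preference scores, internal nodes are lists
# of children; max and min players alternate by depth.

def minimax(node, maximizing):
    if isinstance(node, (int, float)):  # leaf: preference score
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Depth-2 tree: the player (max) picks left/right, then the computer (min).
tree = [[3, 5], [2, 9]]
print(minimax(tree, maximizing=True))  # max(min(3,5), min(2,9)) = 3
```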
Improving Search Efficiency
• Why? Because Minimax is O(b^d), where b is the branching factor and d is the depth
• Alpha-beta pruning
◦ Reduces the number of nodes to search by avoiding bad branch choices, keeping track of two values (α, β):
  α = highest value seen for the maximizing player
  β = smallest value seen for the minimizing player
◦ In the best case the complexity is reduced to O(b^(d/2)); in the worst case it is equivalent to Minimax
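The pruning step can be sketched by threading α and β through the same recursion; a branch is cut off as soon as α ≥ β, because the other player would never let the game reach it.

```python
import math

# Minimax with alpha-beta pruning: identical result to plain Minimax,
# but branches that cannot change the decision are skipped.

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    if isinstance(node, (int, float)):  # leaf: preference score
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)  # best seen for the maximizer
            if alpha >= beta:
                break  # prune: the minimizer will never allow this branch
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)  # best seen for the minimizer
            if alpha >= beta:
                break
        return value

tree = [[3, 5], [2, 9]]
print(alphabeta(tree, maximizing=True))  # 3, same as plain Minimax
```

In this tree the leaf 9 is never evaluated: once the second min node sees the 2, it is already worse than the 3 the maximizer has guaranteed.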
Problems
• Creating a preference function with the right features and then fine-tuning the weights is difficult
• There are slightly better algorithms than alpha-beta pruning, but they are all still exponential in complexity
Tomb Raider: Underworld
• Data from 1365 players who completed all levels of TRU
• Four player types:
◦ Veteran – low death count, mostly from the environment; fast completion; low to average Help on Demand (HOD)
◦ Solver – high death count, mostly from falling; long completion time; minimal HOD
◦ Pacifist – death count varies but is mostly from opponents; below-average completion time; minimal HOD
◦ Runner – dies often, mainly from opponents and the environment; fast completion time; varying HOD
• Game statistics used to determine type:
◦ Cause of death:
  Opponent (28.9% average, 6.32% min, 60.86% max)
  Environment (13.7% average, 2.43% min, 45.31% max)
  Falling (57.2% average, 27.19% min, 83.33% max)
◦ Total number of deaths (140 average, 16 min, 458 max)
◦ Completion time in minutes (550.8 average, 171 min, 1738 max)
◦ Help on Demand (29.4 average, 0 min, 148 max)
http://www.youtube.com/watch?v=HJS-SxgXAI4
Tomb Raider Clustering
• K-means clustering algorithm for 0 < k < 20
◦ Mean quantization error is 19.06% for k=3 and 13.11% for k=4; for k>4 it is between 7% and 2%
• Ward's clustering method
• Emergent Self-Organizing Maps (ESOM)
◦ U-matrix: high mountain ranges indicate borders
◦ P-matrix: larger values indicate higher-density regions
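The k-means step can be sketched from scratch. The six data points below are invented stand-ins for normalized player statistics (e.g. death-cause fraction, completion time); the real study clustered 1365 players.

```python
import random

# Bare-bones k-means sketch: assign each point to its nearest center,
# then recompute each center as the mean of its cluster, and repeat.

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers: random points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        for i, cl in enumerate(clusters):
            if cl:  # keep the old center if its cluster emptied out
                centers[i] = tuple(sum(dim) / len(cl) for dim in zip(*cl))
    return centers

# Invented 2-D player statistics, normalized to [0, 1].
players = [(0.1, 0.2), (0.15, 0.25), (0.8, 0.9),
           (0.85, 0.8), (0.5, 0.1), (0.55, 0.15)]
print(kmeans(players, k=3))
```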
Six components used to determine characteristics of each group
Black and White
• Uses player modeling to determine how "good" or "evil" a player is
• Alignment: a scalar value from -1 to 1
◦ Players start at 0
◦ Each good action increases it a little
◦ Each evil action decreases it a little
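The alignment scalar can be sketched in a few lines; the per-action deltas and action names are invented for illustration.

```python
# Sketch of an alignment scalar: starts at 0, nudged by each action,
# clamped to the range [-1, 1].

class Alignment:
    def __init__(self):
        self.value = 0.0  # players start neutral

    def record(self, delta):
        # Good actions pass a positive delta, evil actions a negative one.
        self.value = max(-1.0, min(1.0, self.value + delta))

a = Alignment()
a.record(+0.5)   # e.g. heal a villager (delta is invented)
a.record(-0.25)  # e.g. throw a villager (delta is invented)
print(a.value)   # 0.25: slightly good
```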
• Alignment affects how people react to you, and how the land under your control looks
• Alignment also affects which advisor is dominant in giving you advice
◦ e.g. if you're too evil, your good advisor will simply give up
Player Models and Storylines
Types of storylines:
1. Linear
2. Branching
3. Player Created
4. Layered
5. Player Specific
Linear
• The storyline is fixed: players can only influence it on a low level
• Example – Half-Life 2: all story events happen in the same sequence every playthrough
Branching
• Player decisions determine which future events will occur
• Example – Star Fox 64: player actions in a level affect which level you can access next
Player Created
• Players act within predefined rules to create unique events on each playthrough
• Example – Spore: players can create their own species; certain traits (predatory/herbivorous, etc.) determine future gameplay (warlike/peaceful)
Layered
• There is one main linear story, but players can complete side quests in different orders
• Example – Oblivion: each player may complete different side quests, but the main storyline is the same
Player Specific
• Story events occur based on what the player would most enjoy
• To do this, we need to model the player's preferences, then select the best event
Constructing the Player Model
1. Maintain a vector of values that measures a player's tendency to fit a certain play style.
2. As the player performs actions, increment or decrement the appropriate values.
Example model:
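The two steps above can be sketched as follows; the play-style names and the mapping from actions to value changes are invented for illustration.

```python
# Sketch of a play-style vector: each observed player action increments
# or decrements the styles it speaks for.

play_style = {"fighter": 0, "talker": 0, "sneaker": 0}

# Hypothetical mapping from actions to play-style adjustments.
ACTION_EFFECTS = {
    "attacked_guard": {"fighter": +1},
    "bribed_guard":   {"talker": +1, "fighter": -1},
    "picked_lock":    {"sneaker": +1},
}

def record_action(model, action):
    for style, delta in ACTION_EFFECTS[action].items():
        model[style] += delta

for action in ["attacked_guard", "bribed_guard",
               "picked_lock", "picked_lock"]:
    record_action(play_style, action)
print(play_style)  # {'fighter': 0, 'talker': 1, 'sneaker': 2}
```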
Selecting Story Events
• Now that we have a model, we can choose a good sequence of events.
Example
Initial event: Fred has been murdered. Clare is standing near the body, calling to the player for help. The killer is hiding in the basement of a nearby mansion.
Example
• Converse with killer: the player discovers the killer's side of the story
• Subtle approach: Clare gives the player a key to the basement
• Headlong approach: the killer waits in the basement for the player and attacks on sight
Rate the Potential Events
• Multiply each player model value by the associated branch value to get the weight for that branch.
Selecting the Best Event
• Branch 2 (Subtle Approach) is the best event here.
Improving the Selection Process
• No negatives! Clamp model values to zero.
• Branch 3 (Converse with killer) is now the best event here.
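The selection step can be sketched as a dot product between the zero-clamped player model and each branch's per-style values, taking the highest-weight branch. The numeric values below are invented.

```python
# Sketch of story-event selection: weight each branch by the (clamped)
# player model and pick the branch with the highest weight.

player_model = {"fighter": -0.4, "talker": 0.8, "sneaker": 0.3}

# Hypothetical per-style values for each candidate branch.
branches = {
    "Headlong Approach":    {"fighter": 1.0},
    "Subtle Approach":      {"sneaker": 1.0},
    "Converse with killer": {"talker": 1.0},
}

def best_branch(model, branches):
    # Clamp model values to zero so a disliked play style
    # cannot drag a branch's weight negative.
    clamped = {k: max(0.0, v) for k, v in model.items()}

    def weight(branch):
        return sum(clamped.get(style, 0.0) * val
                   for style, val in branch.items())

    return max(branches, key=lambda name: weight(branches[name]))

print(best_branch(player_model, branches))  # Converse with killer
```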
Evaluating User Satisfaction
• Eight elements of player satisfaction:
◦ Concentration, challenge, player skills, controls, goals, feedback, immersion, social interaction
◦ User satisfaction questionnaires and post-play interviews can be used during development to adjust the player model
• Two phases of interaction:
◦ Learning phase – rapid performance improvement
  If the beginning is too easy or too hard, users may give up; the game should quickly adjust to the user
◦ Evaluation phase – learning slows and skills become stable
  The AI can expect the player model to remain constant and make better predictions of future events
Case Study: Knock'Em
• Four agent types:
◦ FSM (SM)
◦ Genetic Learning (GL)
◦ Traditional Reinforcement Learning (TRL)
◦ Challenge-Sensitive Reinforcement Learning (CSRL)
• Learning phase – one of the 4 agents is randomly chosen to fight the player
◦ If the player defeats the random agent, he/she faces a harder opponent trained offline with reinforcement learning against random agents; the player must defeat this opponent to leave the learning phase
• Evaluation phase – a constant 20 fights, 5 against each agent type
• A pre-play questionnaire determines whether the player is a beginner or an expert
• An observer watches gameplay and conducts a post-play interview to gauge satisfaction
Results
• 2 beginners and 2 experts polled
• Beginners took 18 and 13 fights to complete the learning phase; experts took 3 to 8 fights
• 3 players chose CSRL as most enjoyable, 1 chose GL
◦ SM and GL were too predictable
◦ TRL was too difficult
Multiplayer Modeling
• Purpose: determine a player's skill level by measuring the player's performance
• Use this model to match players with others of similar skill
Warcraft III
• Each player has a level and gains experience with every win
• Players use levels for bragging rights; Battle.net uses them to match players
Experience Rewards
• If your opponent is a higher level than you:
◦ You gain more experience if you beat him
◦ You lose less experience if you are defeated by him
• This causes player levels to converge asymptotically to a value that accurately represents the player's skill
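This convergence behavior can be sketched with an Elo-style rating update (Battle.net's actual formula is not given on the slides, so this is an analogy, not its implementation): the lower-rated player gains a lot for an upset win and loses little for an expected loss.

```python
# Elo-style sketch of asymmetric rewards: the update size depends on how
# surprising the result was given the rating gap.

def expected_score(rating, opponent):
    # Probability of winning implied by the rating difference.
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

def update(rating, opponent, won, k=32):
    score = 1.0 if won else 0.0
    return rating + k * (score - expected_score(rating, opponent))

# An underdog (1200) beating a favorite (1600) gains a lot...
gain = update(1200, 1600, won=True) - 1200
# ...but loses little when defeated, so ratings converge.
loss = 1200 - update(1200, 1600, won=False)
print(round(gain, 1), round(loss, 1))  # 29.1 2.9
```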
Starcraft 2
• Will use a similar model with ratings instead of levels
• This model is more detailed and can be used to determine who is "favored" in a match
Questions
References
• Rabin, Steve. AI Game Programming Wisdom 4. Course Technology, 2008.
• Rabin, Steve. AI Game Programming Wisdom 3. Course Technology, 2006.
• Rabin, Steve. AI Game Programming Wisdom 2. Course Technology, 2004.
• Andrade, Gustavo, Geber Ramalho, Alex Sandro Gomes (Universidade Federal de Pernambuco), and Vincent Corruble (Université Paris 6). "Dynamic Game Balancing: An Evaluation of User Satisfaction."
• Drachen, Anders, Alessandro Canossa, and Georgios N. Yannakakis. "Player Modeling using Self-Organization in Tomb Raider: Underworld." IEEE Symposium on Computational Intelligence and Games.
• Russell, Stuart, and Peter Norvig. Artificial Intelligence: A Modern Approach. 2nd ed. Upper Saddle River, NJ: Pearson Education, 2003. 165-169.