Player Modeling
Player Modeling
Chris Boston, Mike Pollock, Kipp Hickman
What is Player Modeling?
• Maintains an updated profile of the player
◦ Contains the skills, weaknesses, preferences, etc. of each player
• Adapts over time to a particular player's habits
• Most games are "manually adaptive"
◦ Selecting a basic difficulty level
◦ No machine learning
Designing a Player Model
◦ A list of numeric values, one for each trait
◦ Usually an array of values between 0 and 1 for each trait, initialized to 0.5
◦ Each value represents a single aspect of behavior (strategy, maneuvers, skills)
◦ Choose traits that correlate well with AI actions
◦ Balance between fine statistics that cost a lot to compute and coarse statistics that don't capture enough information
◦ An FSM can also be used
• Decouples the player model from game code, allowing easier tuning
Updating the Player Model
• Determine when to update the model
◦ Some detection routines can be labor-intensive
◦ Fix: keep a log of player actions for later background computation, or reuse info already being computed by the AI code
• Update the model
◦ Least Mean Squares algorithm:
  traitValue = k*observedValue + (1-k)*traitValue
  where k (the learning rate) is usually 0.1-0.3
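The trait array and update rule above can be sketched as a small class. This is a minimal illustration, not code from the slides; the trait names are invented.

```python
# Minimal sketch of a player model updated with the LMS rule.
# Trait names ("aggression", "stealth") are illustrative only.

class PlayerModel:
    def __init__(self, traits, k=0.2):
        # Every trait starts at the neutral value 0.5.
        self.traits = {name: 0.5 for name in traits}
        self.k = k  # learning rate, typically 0.1-0.3

    def update(self, trait, observed):
        # traitValue = k*observedValue + (1-k)*traitValue
        old = self.traits[trait]
        self.traits[trait] = self.k * observed + (1 - self.k) * old

model = PlayerModel(["aggression", "stealth"], k=0.2)
model.update("aggression", 1.0)    # observed one aggressive action
print(model.traits["aggression"])  # 0.2*1.0 + 0.8*0.5 = 0.6
```

Because k is small, a single observation only nudges the trait; repeated observations of the same behavior pull it steadily toward 0 or 1.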
Ways of Using the Player Model
• Predictive – determine how the player will most likely respond to each AI action
• Exploitation – find weaknesses in the player's tactics
◦ Can be used for exploitation and/or tutorials
• Heuristics can measure the current difficulty level faced by the player, then adjust appropriately: if the game is too hard, choose progressively easier actions until the desired difficulty is reached
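The "choose progressively easier actions" heuristic might look like the following sketch. The action names, their difficulty scores, and the target difficulty are all invented for illustration.

```python
# Hypothetical difficulty-adjustment heuristic: if the measured difficulty
# facing the player is above the target, skip AI actions that are harder
# than the target and fall through to easier ones.

TARGET_DIFFICULTY = 0.5  # assumed tuning constant

def choose_action(actions, current_difficulty, target=TARGET_DIFFICULTY):
    """actions: list of (name, difficulty) pairs sorted hardest-first."""
    for name, difficulty in actions:
        # Player is struggling: pass over actions harder than the target.
        if current_difficulty > target and difficulty > target:
            continue
        return name
    return actions[-1][0]  # fall back to the easiest action

actions = [("flank", 0.9), ("charge", 0.6), ("patrol", 0.3)]
print(choose_action(actions, current_difficulty=0.8))  # "patrol"
print(choose_action(actions, current_difficulty=0.4))  # "flank"
```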
Hierarchical Player Model
• Abstract traits are calculated from their related child concrete traits
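A hierarchical model can be sketched as a mapping from abstract traits to their child traits. The slides do not specify how children are aggregated, so simple averaging is an assumption here, and all trait names are invented.

```python
# Sketch of a hierarchical player model: each abstract trait is computed
# from its concrete child traits (aggregation by averaging is assumed).

concrete = {"uses_grenades": 0.8, "charges_head_on": 0.6, "hides": 0.2}

hierarchy = {
    "aggression": ["uses_grenades", "charges_head_on"],
    "caution": ["hides"],
}

def abstract_traits(concrete, hierarchy):
    return {
        parent: sum(concrete[c] for c in children) / len(children)
        for parent, children in hierarchy.items()
    }

print(abstract_traits(concrete, hierarchy))
# aggression = (0.8 + 0.6) / 2 = 0.7, caution = 0.2
```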
Preference-based Player Modeling
• Determine the player's strategy
◦ A strategy is a preference for certain game states
• Two ways of modeling this:
◦ Preference functions
◦ Probabilistic model
Preference Functions
• Assign a preference score to game states: the inputs are numerical features that represent a game state, the output is a numeric preference score
• Example: V(s) = 5*[player's health in s] + 10*[magic points in s] – 7*[number of enemies in s]
• The weights in the preference function are adjusted with reinforcement learning:
◦ If the predicted state does not match what the player did, adjust the weights
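A sketch of the example preference function and one weight-adjustment step follows. The slides only say "adjust the weights," so the perceptron-style update used here (move weights toward the chosen state's features and away from the mispredicted one) is an assumption, as are the feature values.

```python
# Preference function V(s) with the example weights from the slide,
# plus a hypothetical weight-adjustment step when the prediction is wrong.

weights = [5.0, 10.0, -7.0]  # [health, magic points, number of enemies]

def V(weights, features):
    return sum(w * f for w, f in zip(weights, features))

def adjust(weights, chosen, predicted, lr=0.1):
    # Nudge weights toward the state the player actually chose
    # and away from the state we wrongly predicted.
    return [w + lr * (c - p) for w, c, p in zip(weights, chosen, predicted)]

chosen    = [0.9, 0.2, 0.1]  # features of the state the player picked
predicted = [0.5, 0.8, 0.3]  # features of the state we predicted

if V(weights, predicted) > V(weights, chosen):  # prediction was wrong
    weights = adjust(weights, chosen, predicted)
print(weights)  # weights shifted toward the player's actual preference
```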
Probabilistic Model
• The player is represented as a mixture of several predefined player types
◦ e.g. aggressive or defensive
• Each player type has an associated preference function
• Preference scores are computed based on the probability that the player will adopt one of the player types in that state
• The probabilities can be adjusted through learning
• Problems arise if the player doesn't match any of the models
Example
Aggressive type: V(s) = 4*[player's health in s] – 8*[number of enemies in s]
Defensive type: V(s) = 8*[player's health in s] – 8*[number of enemies in s]
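Using the two example functions above, a mixture score can be sketched as a probability-weighted sum. The type probabilities here are invented; in practice they would be adjusted through learning.

```python
# Probabilistic model sketch: the player's overall preference for a state
# is the type-probability-weighted sum of the per-type preference functions.

def v_aggressive(health, enemies):
    return 4 * health - 8 * enemies

def v_defensive(health, enemies):
    return 8 * health - 8 * enemies

# Current (illustrative) belief about which type the player is.
type_probs = {"aggressive": 0.3, "defensive": 0.7}

def mixture_score(health, enemies):
    funcs = {"aggressive": v_aggressive, "defensive": v_defensive}
    return sum(p * funcs[t](health, enemies) for t, p in type_probs.items())

print(mixture_score(health=1.0, enemies=0.5))
# 0.3*(4*1 - 8*0.5) + 0.7*(8*1 - 8*0.5) = 0.3*0 + 0.7*4 = 2.8
```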
Minimax
• Algorithm for finding the optimal move in a zero-sum game
• The player tries to maximize the value, while the computer tries to minimize it
• The player's turns are squares, the computer's are circles
• Actions are left or right
PM Search
• Basic idea: use preference functions for the values of each node, then apply Minimax
• V0 is the computer's preference function, V1 is the player's
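Plain Minimax over a tiny two-ply tree can be sketched as follows. The leaf values stand in for preference-function scores; in the PM-search variant the maximizing nodes would be scored with V1 and the minimizing nodes with V0.

```python
# Minimax sketch: leaves are preference scores, internal nodes are lists
# of children; max and min players alternate by depth.

def minimax(node, maximizing):
    if isinstance(node, (int, float)):  # leaf: preference score
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Depth-2 tree: the player (max) picks left/right, then the computer (min).
tree = [[3, 5], [2, 9]]
print(minimax(tree, maximizing=True))  # max(min(3,5), min(2,9)) = 3
```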
Improving Search Efficiency
• Why? Because Minimax is O(b^d), where b is the branching factor and d is the depth
• Alpha-beta pruning
◦ Reduces the number of nodes to search by avoiding bad branch choices, keeping track of two values (α, β):
  α = highest value seen for the maximizing player
  β = smallest value seen for the minimizing player
◦ In the best case the complexity is reduced to O(b^(d/2)); in the worst case it is equivalent to Minimax
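The pruning step can be sketched by threading α and β through the same recursion; a branch is cut off as soon as α ≥ β, because the other player would never let the game reach it.

```python
import math

# Minimax with alpha-beta pruning: identical result to plain Minimax,
# but branches that cannot change the decision are skipped.

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    if isinstance(node, (int, float)):  # leaf: preference score
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)  # best seen for the maximizer
            if alpha >= beta:
                break  # prune: the minimizer will never allow this branch
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)  # best seen for the minimizer
            if alpha >= beta:
                break
        return value

tree = [[3, 5], [2, 9]]
print(alphabeta(tree, maximizing=True))  # 3, same as plain Minimax
```

In this tree the leaf 9 is never evaluated: once the second min node sees the 2, it is already worse than the 3 the maximizer has guaranteed.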
Problems
• Creating a preference function with the right features and then fine-tuning the weights is difficult
• There are slightly better algorithms than alpha-beta pruning, but they are all still exponential in complexity
Tomb Raider: Underworld
• Data from 1365 players who completed all levels of TRU
• Four player types:
◦ Veteran – low death count, mostly from the environment; fast completion; low to average Help on Demand (HOD)
◦ Solver – high death count, mostly from falling; long completion time; minimal HOD
◦ Pacifist – death count varies but is mostly from opponents; below-average completion time; minimal HOD
◦ Runner – dies often, mainly from opponents and the environment; fast completion time; varying HOD
• Game statistics used to determine type:
◦ Cause of death:
  Opponent (28.9% average, 6.32% min, 60.86% max)
  Environment (13.7% average, 2.43% min, 45.31% max)
  Falling (57.2% average, 27.19% min, 83.33% max)
◦ Total number of deaths (140 average, 16 min, 458 max)
◦ Completion time in minutes (550.8 average, 171 min, 1738 max)
◦ Help on Demand (29.4 average, 0 min, 148 max)
http://www.youtube.com/watch?v=HJS-SxgXAI4
Tomb Raider Clustering
• K-means clustering algorithm for 0 < k < 20
◦ Mean quantization error is 19.06% for k=3 and 13.11% for k=4; for k>4 it is between 7% and 2%
• Ward's clustering method
• Emergent Self-Organizing Maps (ESOM)
◦ U-matrix: high mountain ranges indicate borders
◦ P-matrix: larger values indicate higher-density regions
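The k-means step can be sketched from scratch. The six data points below are invented stand-ins for normalized player statistics (e.g. death-cause fraction, completion time); the real study clustered 1365 players.

```python
import random

# Bare-bones k-means sketch: assign each point to its nearest center,
# then recompute each center as the mean of its cluster, and repeat.

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers: random points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        for i, cl in enumerate(clusters):
            if cl:  # keep the old center if its cluster emptied out
                centers[i] = tuple(sum(dim) / len(cl) for dim in zip(*cl))
    return centers

# Invented 2-D player statistics, normalized to [0, 1].
players = [(0.1, 0.2), (0.15, 0.25), (0.8, 0.9),
           (0.85, 0.8), (0.5, 0.1), (0.55, 0.15)]
print(kmeans(players, k=3))
```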
Six components used to determine characteristics of each group
Black and White
• Uses player modeling to determine how "good" or "evil" a player is
• Alignment: a scalar value from -1 to 1
◦ Players start at 0
◦ Each good action increases it a little
◦ Each evil action decreases it a little
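The alignment scalar can be sketched in a few lines; the per-action deltas and action names are invented for illustration.

```python
# Sketch of an alignment scalar: starts at 0, nudged by each action,
# clamped to the range [-1, 1].

class Alignment:
    def __init__(self):
        self.value = 0.0  # players start neutral

    def record(self, delta):
        # Good actions pass a positive delta, evil actions a negative one.
        self.value = max(-1.0, min(1.0, self.value + delta))

a = Alignment()
a.record(+0.5)   # e.g. heal a villager (delta is invented)
a.record(-0.25)  # e.g. throw a villager (delta is invented)
print(a.value)   # 0.25: slightly good
```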
• Alignment affects how people react to you, and how the land under your control looks
• Alignment also affects which advisor is dominant in giving you advice
◦ e.g. if you're too evil, your good advisor will simply give up
Player Models and Storylines
Types of storylines:
1. Linear
2. Branching
3. Player Created
4. Layered
5. Player Specific
Linear
• The storyline is fixed: players can only influence it on a low level
• Example – Half-Life 2: all story events happen in the same sequence every playthrough
Branching
• Player decisions determine which future events will occur
• Example – Star Fox 64: player actions in a level affect which level you can access next
Player Created
• Players act within predefined rules to create unique events on each playthrough
• Example – Spore: players can create their own species; certain traits (predatory/herbivorous, etc.) determine future gameplay (warlike/peaceful)
Layered
• There is one main linear story, but players can complete side quests in different orders
• Example – Oblivion: each player may complete different side quests, but the main storyline is the same
Player Specific
• Story events occur based on what the player would most enjoy
• To do this, we need to model the player's preferences, then select the best event
Constructing the Player Model
1. Maintain a vector of values that measures a player's tendency to fit a certain play style.
2. As the player performs actions, increment or decrement the appropriate values.
Example model:
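The two steps above can be sketched as follows; the play-style names and the mapping from actions to value changes are invented for illustration.

```python
# Sketch of a play-style vector: each observed player action increments
# or decrements the styles it speaks for.

play_style = {"fighter": 0, "talker": 0, "sneaker": 0}

# Hypothetical mapping from actions to play-style adjustments.
ACTION_EFFECTS = {
    "attacked_guard": {"fighter": +1},
    "bribed_guard":   {"talker": +1, "fighter": -1},
    "picked_lock":    {"sneaker": +1},
}

def record_action(model, action):
    for style, delta in ACTION_EFFECTS[action].items():
        model[style] += delta

for action in ["attacked_guard", "bribed_guard",
               "picked_lock", "picked_lock"]:
    record_action(play_style, action)
print(play_style)  # {'fighter': 0, 'talker': 1, 'sneaker': 2}
```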
Selecting Story Events
• Now that we have a model, we can choose a good sequence of events.
Example
Initial event: Fred has been murdered. Clare is standing near the body, calling to the player for help. The killer is hiding in the basement of a nearby mansion.
Example
• Converse with killer: the player discovers the killer's side of the story
• Subtle approach: Clare gives the player a key to the basement
• Headlong approach: the killer waits in the basement for the player and attacks on sight
Rate the Potential Events
• Multiply each player model value by the associated branch value to get the weight for that branch.
Selecting the Best Event
• Branch 2 (Subtle Approach) is the best event here.
Improving the Selection Process
• No negatives! Clamp model values to zero.
• Branch 3 (Converse with killer) is now the best event here.
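The selection step can be sketched as a dot product between the zero-clamped player model and each branch's per-style values, taking the highest-weight branch. The numeric values below are invented.

```python
# Sketch of story-event selection: weight each branch by the (clamped)
# player model and pick the branch with the highest weight.

player_model = {"fighter": -0.4, "talker": 0.8, "sneaker": 0.3}

# Hypothetical per-style values for each candidate branch.
branches = {
    "Headlong Approach":    {"fighter": 1.0},
    "Subtle Approach":      {"sneaker": 1.0},
    "Converse with killer": {"talker": 1.0},
}

def best_branch(model, branches):
    # Clamp model values to zero so a disliked play style
    # cannot drag a branch's weight negative.
    clamped = {k: max(0.0, v) for k, v in model.items()}

    def weight(branch):
        return sum(clamped.get(style, 0.0) * val
                   for style, val in branch.items())

    return max(branches, key=lambda name: weight(branches[name]))

print(best_branch(player_model, branches))  # Converse with killer
```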
Evaluating User Satisfaction
• Eight elements of player satisfaction:
◦ Concentration, challenge, player skills, controls, goals, feedback, immersion, social interaction
◦ User satisfaction questionnaires and post-play interviews can be used during development to adjust the player model
• Two phases of interaction:
◦ Learning phase – rapid performance improvement
  If the beginning is too easy or too hard, users may give up; the game should quickly adjust to the user
◦ Evaluation phase – learning slows and skills become stable
  The AI can expect the player model to remain constant and make better predictions of future events
Case Study: Knock'Em
• Four agent types:
◦ FSM (SM)
◦ Genetic Learning (GL)
◦ Traditional Reinforcement Learning (TRL)
◦ Challenge-Sensitive Reinforcement Learning (CSRL)
• Learning phase – one of the 4 agents is randomly chosen to fight the player
◦ If the player defeats the random agent, he/she faces a harder opponent trained offline with reinforcement learning against random agents; the player must defeat this opponent to leave the learning phase
• Evaluation phase – a constant 20 fights, 5 against each agent type
• A pre-play questionnaire determines whether the player is a beginner or an expert
• An observer watches gameplay and conducts a post-play interview to gauge satisfaction
Results
• 2 beginners and 2 experts polled
• Beginners took 18 and 13 fights to complete the learning phase; experts took 3 to 8 fights
• 3 players chose CSRL as most enjoyable, 1 chose GL
◦ SM and GL were too predictable
◦ TRL was too difficult
Multiplayer Modeling
• Purpose: determine a player's skill level by measuring the player's performance
• Use this model to match players with others of similar skill
Warcraft III
• Each player has a level and gains experience with every win
• Players use levels for bragging rights; Battle.net uses them to match players
Experience Rewards
• If your opponent is a higher level than you:
◦ You gain more experience if you beat him
◦ You lose less experience if you are defeated by him
• This causes player levels to converge asymptotically to a value that accurately represents the player's skill
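This convergence behavior can be sketched with an Elo-style rating update (Battle.net's actual formula is not given on the slides, so this is an analogy, not its implementation): the lower-rated player gains a lot for an upset win and loses little for an expected loss.

```python
# Elo-style sketch of asymmetric rewards: the update size depends on how
# surprising the result was given the rating gap.

def expected_score(rating, opponent):
    # Probability of winning implied by the rating difference.
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

def update(rating, opponent, won, k=32):
    score = 1.0 if won else 0.0
    return rating + k * (score - expected_score(rating, opponent))

# An underdog (1200) beating a favorite (1600) gains a lot...
gain = update(1200, 1600, won=True) - 1200
# ...but loses little when defeated, so ratings converge.
loss = 1200 - update(1200, 1600, won=False)
print(round(gain, 1), round(loss, 1))  # 29.1 2.9
```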
Starcraft 2
• Will use a similar model with ratings instead of levels
• This model is more detailed and can be used to determine who is "favored" in a match
Questions
References
• Rabin, Steve. AI Game Programming Wisdom 4. Course Technology, 2008.
• Rabin, Steve. AI Game Programming Wisdom 3. Course Technology, 2006.
• Rabin, Steve. AI Game Programming Wisdom 2. Course Technology, 2004.
• Andrade, Gustavo, Geber Ramalho, Alex Sandro Gomes (Universidade Federal de Pernambuco), and Vincent Corruble (Université Paris 6). "Dynamic Game Balancing: An Evaluation of User Satisfaction."
• Drachen, Anders, Alessandro Canossa, and Georgios N. Yannakakis. "Player Modeling using Self-Organization in Tomb Raider: Underworld." IEEE Symposium on Computational Intelligence and Games.
• Russell, Stuart, and Peter Norvig. Artificial Intelligence: A Modern Approach. 2nd ed. Upper Saddle River, NJ: Pearson Education, 2003. 165-169.