lect3.pdf
TRANSCRIPT
-
8/17/2019 lect3.pdf
1/67
Artificial Intelligence: Game Playing
Russell & Norvig: Sections 5.1 & 5.4
Many slides from: robotics.stanford.edu/~latombe/cs121/2003/home.htm
-
Motivation
Go, chess, tic-tac-toe
-
Today
  State Space Search for Game Playing
  MiniMax
  Alpha-beta pruning
  Where we are today
-
State Space Search for Game Playing
  Classical application for heuristic search
    simple games: exhaustively searchable
    complex games: only partial search possible
    additional problem: playing against an opponent
  Type of game: two-person adversarial games
    Perfect information: both players know the state of the game and all possible moves
    No chance involved: the outcome of the game depends only on the players' moves
    Zero-sum game: if the total gains of the players are added up and the total losses are subtracted, they sum to zero
      a gain by one player must be matched by a loss by the other player
    e.g., chess, Go, tic-tac-toe, …
-
Today
  State Space Search for Game Playing
  MiniMax
  Alpha-beta pruning
  Where we are today
-
MiniMax Search
  Game between two opponents, MIN and MAX
    MAX tries to win, and
    MIN tries to minimize MAX’s score
  Assumption: both players have the same knowledge
  Existing heuristic search methods do not work
    they would require a helpful opponent
    need to incorporate “hostile” moves into the search strategy
  MiniMax procedure:
    label each level according to the player’s move
    leaves get a value of 1 or 0 (win for MAX or MIN)
    propagate this value up:
      if parent = MAX, give it the max value of its children
      if parent = MIN, give it the min value of its children
-
Example: Game of Nim
  Rules
    2 players start with a pile of tokens
    move: split (any) existing pile into two non-empty, differently-sized piles
    game ends when no pile can be unevenly split
    the player who cannot make a move loses
-
State Space of Game Nim
  start with one pile of tokens
  each step has to divide one pile of tokens into 2 non-empty piles of different size
  the player without a move left loses the game
source: G. Luger (2005)
-
MiniMax Algorithm
  Label nodes as MIN or MAX, alternating for each level
  Define a utility function (payoff function) to determine the outcome of a game
    e.g., (0, 1) or (-1, 0, 1)
  Do a full search on the tree (expand all nodes until the game is over for each branch)
  Label leaves according to outcome
  Propagate the result up the tree with
    M(n) = max( child nodes ) for a MAX node
    M(n) = min( child nodes ) for a MIN node
  Best next move for MAX is the one leading to the child with the highest value (and vice versa for MIN)
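The propagation rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-list tree encoding, the function names `minimax` and `best_move`, and the example tree are hypothetical and not taken from the slides.

```python
from typing import List, Union

# Assumed encoding: a leaf is a number (utility for MAX),
# an internal node is a list of child subtrees.
Tree = Union[int, float, List["Tree"]]


def minimax(node: Tree, is_max: bool) -> float:
    """Return the backed-up value M(n) of `node`."""
    if not isinstance(node, list):      # leaf: outcome of a finished game
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)


def best_move(node: List[Tree], is_max: bool) -> int:
    """Index of the child the player to move should choose."""
    values = [minimax(child, not is_max) for child in node]
    pick = max if is_max else min
    return values.index(pick(values))


# MAX to move at the root; leaves use the (0, 1) utility.
tree = [[1, 0], [1, [0, 1]], [0, 0]]
print(minimax(tree, True), best_move(tree, True))   # prints "1 1"
```

Only the middle child guarantees MAX a win here: in the other two subtrees MIN can always steer to a leaf worth 0.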
-
Exhaustive MiniMax for Nim
  Bold lines indicate a forced win for MAX
source: G. Luger (2005)
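Since the figure itself is not reproduced in this transcript, here is a small sketch of the same exhaustive search. It assumes a single starting pile of 7 tokens with MIN moving first, which is how the textbook example is usually drawn; the state encoding (a sorted tuple of pile sizes) and the names `nim_moves` and `nim_value` are hypothetical.

```python
from functools import lru_cache


def nim_moves(state):
    """All states reachable by splitting one pile into two unequal piles."""
    successors = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        for a in range(1, (pile + 1) // 2):    # a < pile - a, so sizes differ
            successors.add(tuple(sorted(rest + (a, pile - a))))
    return sorted(successors)


@lru_cache(maxsize=None)
def nim_value(state, max_to_move):
    """1 if MAX wins with best play from `state`, 0 if MIN wins."""
    moves = nim_moves(state)
    if not moves:                              # player to move is stuck and loses
        return 0 if max_to_move else 1
    values = [nim_value(s, not max_to_move) for s in moves]
    return max(values) if max_to_move else min(values)


print(nim_moves((7,)))          # [(1, 6), (2, 5), (3, 4)]
print(nim_value((7,), False))   # 1: MAX has a forced win when MIN starts
```

Under these assumptions, whoever has to move first from a pile of 7 loses against best play, which is consistent with the forced win for MAX noted on the slide.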
-
MiniMax with Heuristic
  Exhaustive search for interesting games is rarely feasible
  Search only to a predefined level
    called n-ply look-ahead
    n is the number of levels
  No exhaustive search
    nodes are evaluated with heuristics and not win/loss
    the backed-up value indicates the best state that can be reached
    horizon effect
  Games with an opponent
    simple strategy: try to maximize the difference between the players using a heuristic function e(n)
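A minimal sketch of n-ply look-ahead follows, assuming the game is supplied as two callables: `successors` (move generator) and `e` (heuristic). The toy game and its heuristic are invented purely so the example runs; they are not from the slides.

```python
from typing import Callable, Iterable, TypeVar

S = TypeVar("S")


def minimax_nply(state: S, depth: int, is_max: bool,
                 successors: Callable[[S], Iterable[S]],
                 e: Callable[[S], float]) -> float:
    """Back up heuristic values from at most `depth` plies below `state`."""
    children = list(successors(state))
    if depth == 0 or not children:      # horizon reached or game over
        return e(state)
    values = [minimax_nply(c, depth - 1, not is_max, successors, e)
              for c in children]
    return max(values) if is_max else min(values)


# Toy game (hypothetical): a state is an integer, a move adds 1 or doubles it.
def toy_successors(n: int):
    return [n + 1, n * 2] if n < 20 else []


def toy_e(n: int) -> float:
    return n % 5 - 2                    # arbitrary heuristic for illustration


for depth in (0, 1, 2):
    print(depth, minimax_nply(3, depth, True, toy_successors, toy_e))
```

The backed-up value of the same root changes with the look-ahead depth (1, then 2, then 0 in this toy game), which is a small-scale picture of why a fixed horizon can mislead the search.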
-
Heuristic Function
  e(n) is a heuristic that estimates how favorable a node n is for MAX
  e(n) > 0 --> n is favorable to MAX
  e(n) < 0 --> n is favorable to MIN
  e(n) = 0 --> n is neutral
-
MiniMax with Fixed Ply Depth
  Leaf nodes show the actual heuristic value e(n)
  Internal nodes show the backed-up heuristic value
source: G. Luger (2005)
-
Choosing an Evaluation Function e(n)
-
Example: e(n) for Tic-Tac-Toe
  Possible e(n):
    e(n) = (number of rows, columns, and diagonals open for MAX) - (number of rows, columns, and diagonals open for MIN)
    e(n) = +∞, if n is a forced win for MAX
    e(n) = -∞, if n is a forced win for MIN
  Examples: e(n) = 8−8 = 0;  e(n) = 6−4 = 2;  e(n) = 3−3 = 0
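The open-lines heuristic can be sketched as follows. Two assumptions are made here: MAX plays "X" on a board stored as a 9-element list, and the ±∞ case is simplified to an already completed line rather than any forced win. The sample position is one board that yields 6 − 4 = 2; the boards from the slide are not available in this transcript.

```python
import math

# The 8 lines of a 3x3 board, as index triples into a 9-cell list.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),      # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),      # columns
         (0, 4, 8), (2, 4, 6)]                 # diagonals


def e(board, max_mark="X", min_mark="O"):
    """Open lines for MAX minus open lines for MIN; +/-inf on a completed line."""
    for line in LINES:
        marks = {board[i] for i in line}
        if marks == {max_mark}:
            return math.inf
        if marks == {min_mark}:
            return -math.inf
    # A line is open for a player if the opponent has no mark on it.
    open_max = sum(all(board[i] != min_mark for i in line) for line in LINES)
    open_min = sum(all(board[i] != max_mark for i in line) for line in LINES)
    return open_max - open_min


empty = [" "] * 9
x_center_o_top = [" ", "O", " ",
                  " ", "X", " ",
                  " ", " ", " "]
print(e(empty), e(x_center_o_top))   # prints "0 2"
```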
-
More examples…
source: G. Luger (2005)
-
Two-ply MiniMax for Opening Move
  Tic-Tac-Toe tree at horizon = 2
source: G. Luger (2005)
-
Two-ply MiniMax: MAX’s possible 2nd moves
source: G. Luger (2005)
-
Two-ply MiniMax: MAX’s move at end
source: G. Luger (2005)
-
Today
  State Space Search for Game Playing
  MiniMax
  Alpha-beta pruning
  Where we are today
-
Alpha-Beta Pruning
  Optimization over minimax that:
    ignores (cuts off, prunes) branches of the tree that cannot possibly lead to a better solution
    reduces the branching factor
    allows deeper search with the same effort
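The "deeper search with the same effort" claim can be made concrete with the standard node-count estimates. The symbols are assumptions introduced here: $b$ is the branching factor and $d$ the search depth. The second bound holds only in the best case, when the best move is always examined first.

```latex
\[
N_{\text{minimax}} = O\!\left(b^{d}\right),
\qquad
N_{\alpha\beta}^{\text{best}} = O\!\left(b^{d/2}\right) = O\!\left((\sqrt{b}\,)^{d}\right)
\]
```

With ideal ordering the effective branching factor therefore drops from $b$ to roughly $\sqrt{b}$, so about twice the depth can be searched for the same number of nodes. With the worst ordering nothing is pruned and the cost is the same as plain minimax.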
-
Alpha-Beta Pruning: Example 1
  With minimax, we look at all possible nodes at the n-ply depth
  With α-β pruning, we ignore branches that could not possibly contribute to the final decision
  B will be >= 5
  So we can ignore B’s right branch, because A must be 3
  D will be = 3
  So we can ignore D’s right branch
  E will be
-
A closer look
  [Figure: a max-level node A with value ≥ 3 and a min-level node C with value ≤ -1, reached after a leaf of -1]
  ← Pruning
  node C will not contribute its value to node A, so this part of the tree can’t have any effect on the value that will be backed up to node A
-
Alpha-Beta Pruning Algorithm
  Start with a depth-first search down to n-ply depth
  Apply heuristic e(n) to a state and its siblings
  Select min or max and propagate the result to the parent, as in minimax
  Offer the result to the grand-parent node as a potential cutoff
  If it is a:
    MAX node, you keep a value called alpha: best (highest) choice for MAX on path (never decreases)
    MIN node, you keep a value called beta: best (lowest) choice for MIN on path (never increases)
  Prune search tree
    alpha cutoff: prune tree below a MIN node, if value ≤ alpha
    beta cutoff: prune tree below a MAX node, if value ≥ beta
  Order sensitive: how much is pruned depends on the order in which children are examined
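The cutoff rules above can be sketched as a recursive function. The nested-list tree encoding, the `visited` bookkeeping, and the example values are assumptions made for illustration; this is one common way to write alpha-beta, not necessarily the exact procedure the lecturer had in mind.

```python
import math


def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node` with alpha-beta cutoffs.

    `node` is a number (leaf value e(n)) or a list of children.
    Leaves that are actually evaluated are appended to `visited`.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node
    if is_max:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)      # alpha never decreases
            if value >= beta:              # beta cutoff below a MAX node
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)            # beta never increases
        if value <= alpha:                 # alpha cutoff below a MIN node
            break
    return value


seen = []
tree = [[2, 1], [-1, 7]]                   # MAX root over two MIN nodes
print(alphabeta(tree, True, visited=seen), seen)   # prints "1 [2, 1, -1]"
```

The leaf 7 is never evaluated: once the second MIN node has seen -1 its value is at most -1, which cannot beat the alpha of 1 already established at the root. Swapping the two MIN subtrees would force all four leaves to be evaluated, which illustrates the order sensitivity.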
-
Example with tic-tac-toe
  [Figure: tic-tac-toe game tree with the root at the max level]
-
Example with tic-tac-toe
  The beta value of a MIN node is an upper bound on the final backed-up value. It can never increase.
  First leaf: e(n) = 2, so β = 2 and value ≤ 2
-
Example with tic-tac-toe
  The beta value of a MIN node is an upper bound on the final backed-up value. It can never increase.
  Leaves: e(n) = 2, e(n) = 1
  β goes from 2 to 1; value ≤ 1
-
Example with tic-tac-toe
  The alpha value of a MAX node is a lower bound on the final backed-up value. It can never decrease.
  MIN node: leaves e(n) = 2, e(n) = 1; β = 1, value = 1
  MAX node: α = 1, value ≥ 1
-
Example with tic-tac-toe
  MAX node (max level): α = 1, value ≥ 1
  First MIN node: leaves e(n) = 2, e(n) = 1; β = 1, value = 1
  Second MIN node: first leaf e(n) = -1; β = -1, value ≤ -1
-
Example with tic-tac-toe
  α = 1 at the MAX node; β = -1 at the second MIN node
  value ≥ 1 and value ≤ -1 are incompatible, so stop searching the right branch; the value cannot come from there!
  Search can be discontinued below any MIN node whose beta value is less than or equal to the alpha value of one of its MAX ancestors
-
Alpha-Beta Pruning: Example 2
  [Figure sequence, slides 31 to 57: step-by-step alpha-beta trace on a game tree; backed-up values are filled in one frame at a time as the search proceeds and branches are pruned. Leaf values as extracted: 0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35]
-
Alpha-Beta Pruning: Example 3
source: http://en.wikipedia.org/wiki/File:AB_pruning.svg
-
Today
  State Space Search for Game Playing
  MiniMax
  Alpha-beta pruning
  Where we are today
-
Checkers: Tinsley vs. Chinook
  Marion Tinsley: world champion for over 40 years
  VS
  Chinook (Jonathan Schaeffer, professor at the U. of Alberta)
  1994: Chinook won the man-machine world championship
  2007: Schaeffer's team showed that checkers is a draw under perfect play
-
Chess: Kasparov vs. Deep Blue
  Garry Kasparov
    50 billion neurons
    2 positions/sec
  VS
  Deep Blue
    32 RISC processors + 256 VLSI chess engines
    200,000,000 pos/sec
  1997: Deep Blue wins the match 3.5 to 2.5 (2 wins, 1 loss, and 3 draws)
-
Chess: Kasparov vs. Deep Junior
  Garry Kasparov
    still 50 billion neurons
    still 2 positions/sec
  VS
  Deep Junior
    8 CPU, 8 GB RAM, Win 2000
    2,000,000 pos/sec
    Available at $100
  2003: Match ends in a 3/3 tie!
-
Othello: Murakami vs. Logistello
  Takeshi Murakami: World Othello champion
  VS
  Logistello
    developed by Michael Buro
    runs on a standard PC
    https://skatgame.net/mburo/log.html (including source code)
  1997: Logistello beat Murakami by 6 games to 0
-
Go: Goemate
  ?
  VS
  Goemate: the best Go program available today
    Developed by Chen Zhixing
  Go has too high a branching factor for existing search techniques
  Programs must rely on huge databases and pattern-recognition techniques
-
Secrets
  Many game programs are based on: alpha-beta pruning + iterative deepening + huge databases + ...
  For instance, Chinook searched all checkers configurations with 8 pieces or fewer and created an endgame database of 444 billion board configurations
-
Perspective on Games: Con and Pro
  "Chess is the Drosophila of artificial intelligence. However, computer chess has developed much as genetics might have if the geneticists had concentrated their efforts starting in 1910 on breeding racing Drosophila. We would have some science, but mainly we would have very fast fruit flies."
    John McCarthy
  "Saying Deep Blue doesn’t really think about chess is like saying an airplane doesn't really fly because it doesn't flap its wings."
    Drew McDermott
-
Today
  State Space Search for Game Playing
  MiniMax
  Alpha-beta pruning
  Where we are today