lect3.pdf

8/17/2019 lect3.pdf

1/67

Artificial Intelligence:

1

Game Playing Russell & Norvig: Sections 5.1 & 5.4

Many slides from:robotics.stanford.edu/~latombe/cs121/2003/home.htm

8/17/2019 lect3.pdf

2/67

MotivationGO

2

chess tic-tac-toe

8/17/2019 lect3.pdf

3/67

Today State Space Search for Game Playing

MiniMax

Alpha-beta pruning

Where we are today

3

8/17/2019 lect3.pdf

4/67

State Space Search for Game Playing

Classical application for heuristic search simple games: exhaustibly searchable

complex games: only partial search possible

additional problem: playing against opponent

Type of game: 2 person adversarial games

4

,

Perfect information both players know the state of the game and all possible moves

No chance involved outcome of the game is only dependent on player’s moves

zero-sum game If the total gains of one player are added up, and the total losses are subtracted,

they will sum to zero.

a gain by one player must be matched by a loss by the other player

ex. chess, GO, tic-tac-toe, …

8/17/2019 lect3.pdf

5/67


MiniMax

Alpha-beta pruning Where we are today

5

8/17/2019 lect3.pdf

6/67

MiniMax Search Game between two opponents, MIN and MAX

MAX tries to win, and

MIN tries to minimize MAX’s score Assumption: both players have the same knowledge

Existing heuristic search methods do not work would re uire a hel ul o onent

6

Need to incorporate “hostile” moves into search strategy

MiniMax procedure: label each level according to player’s move

leaves get a value of 1 or 0 (win for MAX or MIN) propagate this value up:

if parent=MAX, give it max value of children

if parent=MIN, give it min value of children

8/17/2019 lect3.pdf

7/67

Example: Game of Nim Rules

2 players start with a pile of tokens

move: split (any) existing pile into two non-emptydifferently-sized piles

game ends when no pile can be unevenly split

7

player who cannot make his move loses

8/17/2019 lect3.pdf

8/67

State Space of Game Nim start with one pile of tokens

each step has to divide one pileof tokens into 2 non-empty pilesof different size

player without a move left losesgame

8source: G. Luger (2005)

8/17/2019 lect3.pdf

9/67

MiniMax Algorithm Label nodes as MIN or MAX, alternating for each level

Define utility function (payoff function)

to determine outcome of a game e.g., (0, 1) or (-1, 0, 1)

Do full search on tree (expand all nodes until game is overfor each branch

9

Label leaves according to outcome Propagate result up the tree with

M(n) = max( child nodes ) for a MAX node

M(n) = min( child nodes ) for a MIN node

Best next move for MAX is the one leading to the child withthe highest value (and vice versa for MIN)

8/17/2019 lect3.pdf

10/67

Exhaustive MiniMax for Nim

10

Bold lines indicateforced win for MAX

source: G. Luger (2005)

8/17/2019 lect3.pdf

11/67

MiniMax with Heuristic Exhaustive search for interesting games is rarelyfeasible

Search only to predefined level called n-ply look-ahead

n is number of levels

No exhaustive search

11

nodes evaluated with heuristics and not win/loss indicates best state that can be reached

horizon effect

Games with opponent simple strategy: try to maximize difference between

players using a heuristic function e(n)

8/17/2019 lect3.pdf

12/67

Heuristic Function e(n) is a heuristic that estimates how favorable a

node n is for MAX e(n) > 0 --> n is favorable to MAX

e(n) < 0 --> n is favorable to MIN

e n = 0 --> n is neutral

12

8/17/2019 lect3.pdf

13/67

MiniMax with Fixed Ply Depth

13

Leaf nodes show the actual heuristic value e(n)Internal nodes show back-up heuristic value

source: G. Luger (2005)

8/17/2019 lect3.pdf

14/67

Choosing an Evaluation Function e(n)

14

8/17/2019 lect3.pdf

15/67

Example: e(n) for Tic-Tac-Toe

Possible e(n) e(n) = number of rows, columns, and diagonals open for MAX

- number of rows, columns, and diagonals open for MIN

e(n) =+∞, if n is a forced win for MAX

e(n) =-∞, if n is a forced win for MIN

e(n) = 8−8 = 0 e(n) = 6−4 = 2 e(n) = 3−3 = 0

15

8/17/2019 lect3.pdf

16/67

More examples…


8/17/2019 lect3.pdf

17/67

Two-ply MiniMax for Opening Move

Tic-Tac-Toe tree

at horizon = 2


8/17/2019 lect3.pdf

18/67

Two-ply MiniMax: MAX’s possible 2nd moves


8/17/2019 lect3.pdf

19/67

Two-ply minimax: MAX’s move at end


8/17/2019 lect3.pdf

20/67


MiniMax

Alpha-beta pruning Where we are today

20

8/17/2019 lect3.pdf

21/67

Alpha-Beta Pruning

Optimization over minimax, that: ignores (cuts off, prunes) branches of the tree

that cannot possibly lead to a better solution reduces branching factor

allows deeper search with same effort

21

8/17/2019 lect3.pdf

22/67

Alpha-Beta Pruning: Example 1 With minimax, we look at all possibilities nodes at the n-ply

depth

With α-β pruning, we ignore branches that could not

possibly contribute to the final decisionB will be >= 5So we can ignore B’s rightbranch, because A must be 3

22

D will be = 3

So we can ignore D’s rightbranch

E will be

8/17/2019 lect3.pdf

23/67

A closer look

-

≥≥≥≥ 3

min level

max level≤≤≤≤ -1 A

-1

← Pruningnode C will not contribute its

value to node A… so this partof the tree can’t have anyeffect on the value that will be

backed up to node A

23

8/17/2019 lect3.pdf

24/67

Alpha-Beta Pruning Algorithm Start with depth-first search down to n-ply depth

apply heuristic e(n) to a state and its siblings

Select min or max and propagate result to parent as in minimax

Offer result to grand-parent node as potential cutoff

If it is a:

MAX node, you keep a value called alpha

24

es g es c o ce or on pa never ecreases

MIN node, you keep a value called beta best (lowest) choice for MIN on path (never increases)

Prune search tree

alpha cutoff: prune tree below MIN node, if value ≤ alpha

beta cutoff: prune tree below MAX node, if value ≥ beta order sensitive

8/17/2019 lect3.pdf

25/67

Example with tic-tac-toe

max level

25

8/17/2019 lect3.pdf

26/67


The beta value of a MIN

max level

β = 2

e(n) = 2

the final backed-up value.It can never increase

26

value ≤ 2

8/17/2019 lect3.pdf

27/67


The beta value of a MIN

max level

the final backed-up value.It can never increase

e(n) = 1

β = 2 1

e(n) = 2

27

value ≤≤≤≤ 2 1

8/17/2019 lect3.pdf

28/67


α = 1

value ≥ 1

The alpha value of a MAXnode is a lower bound on

the final backed-up value.

It can never decrease

e(n) = 1

β = 1value = 1

e(n) = 2

28

8/17/2019 lect3.pdf

29/67


α = 1

value ≥ 1

max level

e(n) = 1

β = 1value = 1

e(n) = 2 e(n) = -1

= -value ≤≤≤≤ -1

29

8/17/2019 lect3.pdf

30/67


α = 1

value ≥ 1 and value ≤≤≤≤ -1

incompatible…so stop searching the right branch;the value cannot come from there!

incompatible…so stop searching the right branch;the value cannot come from there!

e(n) = 1

β = 1

e(n) = 2 e(n) = -1

= -value ≤≤≤≤ -1Search can be discontinued below

any MIN node whose beta value isless than or equal to the alpha value

of one of its MAX ancestors

Search can be discontinued belowany MIN node whose beta value isless than or equal to the alpha value

of one of its MAX ancestors

30

8/17/2019 lect3.pdf

31/67

Alpha-Beta Pruning: Example 2

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

31

8/17/2019 lect3.pdf

32/67


0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

32

8/17/2019 lect3.pdf

33/67


0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0

33

8/17/2019 lect3.pdf

34/67


0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3

34

8/17/2019 lect3.pdf

35/67


0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3

35

8/17/2019 lect3.pdf

36/67


0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3

36

8/17/2019 lect3.pdf

37/67


0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

37

8/17/2019 lect3.pdf

38/67


0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

38

8/17/2019 lect3.pdf

39/67


0

0

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

39

8/17/2019 lect3.pdf

40/67


0

0

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

5

40

8/17/2019 lect3.pdf

41/67


0

0

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

41

8/17/2019 lect3.pdf

42/67


0

0

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

42

8/17/2019 lect3.pdf

43/67


0

0

0

2

2

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

43

8/17/2019 lect3.pdf

44/67


0

0

0

2

2

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

44

8/17/2019 lect3.pdf

45/67


0

0

0

2

2

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

45

l h E l

8/17/2019 lect3.pdf

46/67


0

0

0

2

2

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

5

46

l h B P E l 2

8/17/2019 lect3.pdf

47/67


0

0

0

2

2

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

1

1

47

Al h B P i E l 2

8/17/2019 lect3.pdf

48/67


0

0

0

2

2

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

1

1

-3

48

Al h B t P i E l 2

8/17/2019 lect3.pdf

49/67


0

0

0

2

2

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

1

1

-3

49

Al h B t P i E l 2

8/17/2019 lect3.pdf

50/67


0

0

0

2

2

1

1

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

1

1

-3

50

Al h B t P i E l 2

8/17/2019 lect3.pdf

51/67


0

0

0

2

2

1

1

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

1

1

-3 -5

51

Al h B t P i : E l 2

8/17/2019 lect3.pdf

52/67


0

0

0

2

2

1

1

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

1

1

-3 -5

52

Alpha Beta Prunin : Example 2

8/17/2019 lect3.pdf

53/67


0

0

0

2

2

1

1

-5

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

1

1

-3 -5

-5

53

Alpha Beta Pruning: Example 2

8/17/2019 lect3.pdf

54/67


0

0

0

2

2

1

1

-5

0

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

1

1

-3 -5

-5

54


8/17/2019 lect3.pdf

55/67


0

0

0

2

2

1

1

-5

0

1

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

1

1

-3 -5

-5

55


8/17/2019 lect3.pdf

56/67


0

0

0

2

2

1

1

-5 2

2

1

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

1

1

-3 -5

-5

2

2

56


8/17/2019 lect3.pdf

57/67


0

0

0

2

2

1

1

-5

1

2

2

1

0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

0

0 -3 3

3

2

2

1

1

-3 -5

-5

2

2

57


8/17/2019 lect3.pdf

58/67


58

source: http://en.wikipedia.org/wiki/File:AB_pruning.svg

Today

8/17/2019 lect3.pdf

59/67


MiniMax

Alpha-beta pruning

Where we are today

59

Checkers: Tinsley vs Chinook

8/17/2019 lect3.pdf

60/67

Checkers: Tinsley vs. ChinookMarion Tinsley

World championfor over 40 years

Chinook

VS

2007: Chinook won the world championship

Jonathan Schaeffer,professor at the U. of Alberta,

60

Chess: Kasparov vs Deep Blue

8/17/2019 lect3.pdf

61/67

Chess: Kasparov vs. Deep Blue

Garry Kasparov50 billion neurons

2 positions/sec

VSDeep Blue

32 RISC processors

+ 256 VLSI chess engines

200,000,000 pos/sec

1997: Deep Blue wins by 3 wins, 1 loss, and 2 draws

61

Chess: Kasparov vs Deep Junior

8/17/2019 lect3.pdf

62/67

Chess: Kasparov vs. Deep Junior

Garry Kasparovstill 50 billion neurons

still 2 positions/sec

VS

2003: Match ends in a 3/3 tie!

Deep Junior8 CPU, 8 GB RAM, Win 2000

2,000,000 pos/sec

Available at $100

62

Othello: Murakami vs Logistello

8/17/2019 lect3.pdf

63/67

Othello: Murakami vs. Logistello

Takeshi MurakamiWorld Othello champion

VS

Lo istello

1997: Logistello beat Murakami by 6 games to 0

developed by Michael Buro

runs on a standard PC

https://skatgame.net/mburo/log.html

(including source code)

63

Go: Goemate

8/17/2019 lect3.pdf

64/67

Go: Goemate

?

VS

Goematethe best Go ro ram available toda

Developed by Chen Zhixing

Go has too high a branching factor forexisting search techniques

Programs must rely on huge databasesand pattern-recognition techniques

Go has too high a branching factor forexisting search techniques

Programs must rely on huge databasesand pattern-recognition techniques

64

Secrets

8/17/2019 lect3.pdf

65/67

Secrets

Many game programs are based on: alpha-beta pruning +

iterative deepening + huge databases + ...

For instance, Chinook searched all checkersconfigurations with 8 pieces or less and created anendgame database of 444 billion board configurations

65

Perspective on Games: Con and Pro

8/17/2019 lect3.pdf

66/67

p

Chess is the Drosophila ofartificial intelligence. However,computer chess has developedmuch as genetics might have if

Saying Deep Blue doesn’treally think about chessis like saying an airplanedoesn't really fly because

the geneticists had concentrated

their efforts starting in 1910 onbreeding racing Drosophila. Wewould have some science, butmainly we would have very fast

fruit flies.John McCarthy

it doesn't flap its wings.

Drew McDermott

66

Today

8/17/2019 lect3.pdf

67/67

y State Space Search for Game Playing

MiniMax

Alpha-beta pruning

Where we are today

67

lect3.pdf

Documents