lect3.pdf

Upload: erjaimin89

Post on 06-Jul-2018

219 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/17/2019 lect3.pdf

    1/67

    Artificial Intelligence:

    1

     Game Playing Russell & Norvig: Sections 5.1 & 5.4

    Many slides from:robotics.stanford.edu/~latombe/cs121/2003/home.htm

  • 8/17/2019 lect3.pdf

    2/67

    MotivationGO

    2

    chess tic-tac-toe

  • 8/17/2019 lect3.pdf

    3/67

    Today State Space Search for Game Playing

    MiniMax

    Alpha-beta pruning

    Where we are today

    3

  • 8/17/2019 lect3.pdf

    4/67

    State Space Search for Game Playing

    Classical application for heuristic search simple games: exhaustibly searchable

    complex games: only partial search possible

    additional problem: playing against opponent

    Type of game: 2 person adversarial games

    4

    ,

    Perfect information both players know the state of the game and all possible moves

    No chance involved outcome of the game is only dependent on player’s moves

    zero-sum game If the total gains of one player are added up, and the total losses are subtracted,

    they will sum to zero.

    a gain by one player must be matched by a loss by the other player

    ex. chess, GO, tic-tac-toe, …

  • 8/17/2019 lect3.pdf

    5/67

    Today State Space Search for Game Playing

    MiniMax

    Alpha-beta pruning Where we are today

    5

  • 8/17/2019 lect3.pdf

    6/67

    MiniMax Search Game between two opponents, MIN and MAX

    MAX tries to win, and

    MIN tries to minimize MAX’s score Assumption: both players have the same knowledge

    Existing heuristic search methods do not work would re uire a hel ul o onent

    6

     

    Need to incorporate “hostile” moves into search strategy

    MiniMax procedure: label each level according to player’s move

    leaves get a value of 1 or 0 (win for MAX or MIN) propagate this value up:

    if parent=MAX, give it max value of children

    if parent=MIN, give it min value of children

  • 8/17/2019 lect3.pdf

    7/67

    Example: Game of Nim Rules

    2 players start with a pile of tokens

    move: split (any) existing pile into two non-emptydifferently-sized piles

    game ends when no pile can be unevenly split

    7

     

    player who cannot make his move loses

  • 8/17/2019 lect3.pdf

    8/67

    State Space of Game Nim start with one pile of tokens

    each step has to divide one pileof tokens into 2 non-empty pilesof different size

    player without a move left losesgame

    8source: G. Luger (2005)

  • 8/17/2019 lect3.pdf

    9/67

    MiniMax Algorithm Label nodes as MIN or MAX, alternating for each level

    Define utility function (payoff function)

    to determine outcome of a game e.g., (0, 1) or (-1, 0, 1)

    Do full search on tree (expand all nodes until game is overfor each branch

    9

     

    Label leaves according to outcome Propagate result up the tree with

    M(n) = max( child nodes ) for a MAX node

    M(n) = min( child nodes ) for a MIN node

    Best next move for MAX is the one leading to the child withthe highest value (and vice versa for MIN)

  • 8/17/2019 lect3.pdf

    10/67

    Exhaustive MiniMax for Nim

    10

    Bold lines indicateforced win for MAX

    source: G. Luger (2005)

  • 8/17/2019 lect3.pdf

    11/67

    MiniMax with Heuristic Exhaustive search for interesting games is rarelyfeasible

    Search only to predefined level called n-ply look-ahead

    n is number of levels

    No exhaustive search

    11

      nodes evaluated with heuristics and not win/loss indicates best state that can be reached

    horizon effect

    Games with opponent simple strategy: try to maximize difference between

    players using a heuristic function e(n)

  • 8/17/2019 lect3.pdf

    12/67

    Heuristic Function e(n) is a heuristic that estimates how favorable a

    node n is for MAX e(n) > 0 --> n is favorable to MAX

    e(n) < 0 --> n is favorable to MIN

    e n = 0 --> n is neutral

    12

  • 8/17/2019 lect3.pdf

    13/67

    MiniMax with Fixed Ply Depth

    13

    Leaf nodes show the actual heuristic value e(n)Internal nodes show back-up heuristic value

    source: G. Luger (2005)

  • 8/17/2019 lect3.pdf

    14/67

    Choosing an Evaluation Function e(n)

    14

  • 8/17/2019 lect3.pdf

    15/67

    Example: e(n) for Tic-Tac-Toe

    Possible e(n) e(n) = number of rows, columns, and diagonals open for MAX

    - number of rows, columns, and diagonals open for MIN

    e(n) =+∞, if n is a forced win for MAX

    e(n) =-∞, if n is a forced win for MIN

    e(n) = 8−8 = 0 e(n) = 6−4 = 2 e(n) = 3−3 = 0

    15

  • 8/17/2019 lect3.pdf

    16/67

    More examples…

    16source: G. Luger (2005)

  • 8/17/2019 lect3.pdf

    17/67

    Two-ply MiniMax for Opening Move

    Tic-Tac-Toe tree

    at horizon = 2

    17source: G. Luger (2005)

  • 8/17/2019 lect3.pdf

    18/67

    Two-ply MiniMax: MAX’s possible 2nd moves

    18source: G. Luger (2005)

  • 8/17/2019 lect3.pdf

    19/67

    Two-ply minimax: MAX’s move at end

    19source: G. Luger (2005)

  • 8/17/2019 lect3.pdf

    20/67

    Today State Space Search for Game Playing

    MiniMax

    Alpha-beta pruning Where we are today

    20

  • 8/17/2019 lect3.pdf

    21/67

    Alpha-Beta Pruning

    Optimization over minimax, that: ignores (cuts off, prunes) branches of the tree

    that cannot possibly lead to a better solution reduces branching factor

    allows deeper search with same effort

    21

  • 8/17/2019 lect3.pdf

    22/67

    Alpha-Beta Pruning: Example 1 With minimax, we look at all possibilities nodes at the n-ply

    depth

    With α-β pruning, we ignore branches that could not

    possibly contribute to the final decisionB will be >= 5So we can ignore B’s rightbranch, because A must be 3

     

    22

    D will be = 3

    So we can ignore D’s rightbranch

    E will be

  • 8/17/2019 lect3.pdf

    23/67

    A closer look

    -

    ≥≥≥≥ 3

    min level

    max level≤≤≤≤ -1 A

    -1

    ← Pruningnode C will not contribute its

    value to node A… so this partof the tree can’t have anyeffect on the value that will be

    backed up to node A

    23

  • 8/17/2019 lect3.pdf

    24/67

    Alpha-Beta Pruning Algorithm Start with depth-first search down to n-ply depth

    apply heuristic e(n) to a state and its siblings

    Select min or max and propagate result to parent as in minimax

    Offer result to grand-parent node as potential cutoff

    If it is a:

    MAX node, you keep a value called alpha 

    24

    es g es c o ce or on pa never ecreases

    MIN node, you keep a value called beta best (lowest) choice for MIN on path (never increases)

    Prune search tree

    alpha cutoff: prune tree below MIN node, if value ≤ alpha

    beta cutoff: prune tree below MAX node, if value ≥ beta order sensitive

  • 8/17/2019 lect3.pdf

    25/67

    Example with tic-tac-toe

     

    max level

    25

     

  • 8/17/2019 lect3.pdf

    26/67

    Example with tic-tac-toe

     

    The beta value of a MIN 

    max level

     

    β = 2

    e(n) = 2

     

    the final backed-up value.It can never increase

    26

     

    value ≤ 2

  • 8/17/2019 lect3.pdf

    27/67

    Example with tic-tac-toe

    The beta value of a MIN 

    max level

     

    the final backed-up value.It can never increase

    e(n) = 1

    β = 2 1

    e(n) = 2

    27

    value ≤≤≤≤ 2 1

     

  • 8/17/2019 lect3.pdf

    28/67

    Example with tic-tac-toe

    α = 1

    value ≥ 1

    The alpha value of a MAXnode is a lower bound on

    the final backed-up value.

    It can never decrease

     

    e(n) = 1

    β = 1value = 1

    e(n) = 2

    28

     

  • 8/17/2019 lect3.pdf

    29/67

    Example with tic-tac-toe

    α = 1

    value ≥ 1

     

    max level

    e(n) = 1

    β = 1value = 1

    e(n) = 2 e(n) = -1

    = -value ≤≤≤≤ -1

    29

     

  • 8/17/2019 lect3.pdf

    30/67

    Example with tic-tac-toe

    α = 1

    value ≥ 1 and value ≤≤≤≤ -1

     

    incompatible…so stop searching the right branch;the value cannot come from there!

    incompatible…so stop searching the right branch;the value cannot come from there!

    e(n) = 1

    β = 1

    e(n) = 2 e(n) = -1

    = -value ≤≤≤≤ -1Search can be discontinued below

    any MIN node whose beta value isless than or equal to the alpha value

    of one of its MAX ancestors

    Search can be discontinued belowany MIN node whose beta value isless than or equal to the alpha value

    of one of its MAX ancestors

    30

  • 8/17/2019 lect3.pdf

    31/67

    Alpha-Beta Pruning: Example 2

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    31

  • 8/17/2019 lect3.pdf

    32/67

    Alpha-Beta Pruning: Example 2

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    32

  • 8/17/2019 lect3.pdf

    33/67

    Alpha-Beta Pruning: Example 2

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0

    33

  • 8/17/2019 lect3.pdf

    34/67

    Alpha-Beta Pruning: Example 2

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3

    34

  • 8/17/2019 lect3.pdf

    35/67

    Alpha-Beta Pruning: Example 2

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3

    35

  • 8/17/2019 lect3.pdf

    36/67

    Alpha-Beta Pruning: Example 2

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3

    36

  • 8/17/2019 lect3.pdf

    37/67

    Alpha-Beta Pruning: Example 2

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    37

  • 8/17/2019 lect3.pdf

    38/67

    Alpha-Beta Pruning: Example 2

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    38

  • 8/17/2019 lect3.pdf

    39/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    39

  • 8/17/2019 lect3.pdf

    40/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    5

    40

  • 8/17/2019 lect3.pdf

    41/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    41

  • 8/17/2019 lect3.pdf

    42/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    42

  • 8/17/2019 lect3.pdf

    43/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    43

  • 8/17/2019 lect3.pdf

    44/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    44

  • 8/17/2019 lect3.pdf

    45/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    45

    l h E l

  • 8/17/2019 lect3.pdf

    46/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    5

    46

    l h B P E l 2

  • 8/17/2019 lect3.pdf

    47/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    1

    1

    47

    Al h B P i E l 2

  • 8/17/2019 lect3.pdf

    48/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    1

    1

    -3

    48

    Al h B t P i E l 2

  • 8/17/2019 lect3.pdf

    49/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    1

    1

    -3

    49

    Al h B t P i E l 2

  • 8/17/2019 lect3.pdf

    50/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    1

    1

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    1

    1

    -3

    50

    Al h B t P i E l 2

  • 8/17/2019 lect3.pdf

    51/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    1

    1

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    1

    1

    -3 -5

    51

    Al h B t P i : E l 2

  • 8/17/2019 lect3.pdf

    52/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    1

    1

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    1

    1

    -3 -5

    52

    Alpha Beta Prunin : Example 2

  • 8/17/2019 lect3.pdf

    53/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    1

    1

    -5

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    1

    1

    -3 -5

    -5

    53

    Alpha Beta Pruning: Example 2

  • 8/17/2019 lect3.pdf

    54/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    1

    1

    -5

    0

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    1

    1

    -3 -5

    -5

    54

    Alpha Beta Pruning: Example 2

  • 8/17/2019 lect3.pdf

    55/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    1

    1

    -5

    0

    1

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    1

    1

    -3 -5

    -5

    55

    Alpha Beta Pruning: Example 2

  • 8/17/2019 lect3.pdf

    56/67

    Alpha-Beta Pruning: Example 21

    0

    0

    0

    2

    2

    1

    1

    -5 2

    2

    1

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    1

    1

    -3 -5

    -5

    2

    2

    56

    Alpha Beta Pruning: Example 2

  • 8/17/2019 lect3.pdf

    57/67

    Alpha-Beta Pruning: Example 2

    0

    0

    0

    2

    2

    1

    1

    -5

    1

    2

    2

    1

    0 5 -3 25-2 32-3 033 -501 -350 1-55 3 2-35

    0

    0   -3 3

    3

    2

    2

    1

    1

    -3 -5

    -5

    2

    2

    57

    Alpha Beta Pruning: Example 3

  • 8/17/2019 lect3.pdf

    58/67

    Alpha-Beta Pruning: Example 3

    58

    source: http://en.wikipedia.org/wiki/File:AB_pruning.svg

    Today

  • 8/17/2019 lect3.pdf

    59/67

    Today State Space Search for Game Playing

    MiniMax

    Alpha-beta pruning

    Where we are today

    59

    Checkers: Tinsley vs Chinook

  • 8/17/2019 lect3.pdf

    60/67

    Checkers: Tinsley vs. ChinookMarion Tinsley

    World championfor over 40 years

    Chinook 

    VS

    2007: Chinook won the world championship

     

    Jonathan Schaeffer,professor at the U. of Alberta,

    60

    Chess: Kasparov vs Deep Blue

  • 8/17/2019 lect3.pdf

    61/67

    Chess: Kasparov vs. Deep Blue

    Garry Kasparov50 billion neurons

    2 positions/sec

    VSDeep Blue

    32 RISC processors

    + 256 VLSI chess engines

    200,000,000 pos/sec

    1997: Deep Blue wins by 3 wins, 1 loss, and 2 draws

    61

    Chess: Kasparov vs Deep Junior

  • 8/17/2019 lect3.pdf

    62/67

    Chess: Kasparov vs. Deep Junior

    Garry Kasparovstill 50 billion neurons

    still 2 positions/sec

    VS

    2003: Match ends in a 3/3 tie!

    Deep Junior8 CPU, 8 GB RAM, Win 2000

    2,000,000 pos/sec

    Available at $100

    62

    Othello: Murakami vs Logistello

  • 8/17/2019 lect3.pdf

    63/67

    Othello: Murakami vs. Logistello

    Takeshi MurakamiWorld Othello champion

    VS

    Lo istello

    1997: Logistello beat Murakami by 6 games to 0

    developed by Michael Buro

    runs on a standard PC

    https://skatgame.net/mburo/log.html

    (including source code) 

    63

    Go: Goemate

  • 8/17/2019 lect3.pdf

    64/67

    Go: Goemate

    ?

    VS

    Goematethe best Go ro ram available toda 

    Developed by Chen Zhixing

    Go has too high a branching factor forexisting search techniques

    Programs must rely on huge databasesand pattern-recognition techniques

    Go has too high a branching factor forexisting search techniques

    Programs must rely on huge databasesand pattern-recognition techniques

    64

    Secrets

  • 8/17/2019 lect3.pdf

    65/67

    Secrets

    Many game programs are based on: alpha-beta pruning +

    iterative deepening + huge databases + ...

    For instance, Chinook searched all checkersconfigurations with 8 pieces or less and created anendgame database of 444 billion board configurations

    65

    Perspective on Games: Con and Pro

  • 8/17/2019 lect3.pdf

    66/67

    p

    Chess is the Drosophila ofartificial intelligence. However,computer chess has developedmuch as genetics might have if

    Saying Deep Blue doesn’treally think about chessis like saying an airplanedoesn't really fly because

    the geneticists had concentrated

    their efforts starting in 1910 onbreeding racing Drosophila. Wewould have some science, butmainly we would have very fast

    fruit flies.John McCarthy

    it doesn't flap its wings.

    Drew McDermott

    66

    Today

  • 8/17/2019 lect3.pdf

    67/67

    y State Space Search for Game Playing

    MiniMax

    Alpha-beta pruning

    Where we are today

    67