Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search
Presented by Melvin Zhang
Remi Coulom, Computers and Games 2006
Selection
Principle: Moves that look best should be searched deeper, and bad moves should be searched less.
Crazy Stone: Each move i is selected with probability proportional to
\[
\exp\left(-2.4\,\frac{\mu^{*} - \mu_i}{\sqrt{2\,(\sigma_{*}^{2} + \sigma_i^{2})}}\right) + \varepsilon_i
\]
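As a rough illustration of the selection rule above, the following Python sketch computes the unnormalized weights and turns them into probabilities. It assumes that µ* and σ*² are the estimates of the move with the highest mean, that all variances are positive, and that the εi values are supplied by the caller; the slides do not specify any of these, and the function names are hypothetical.

```python
import math


def selection_weights(mu, sigma2, eps):
    """Unnormalized weight of each move under the formula above.

    mu[i]     -- estimated mean value of move i
    sigma2[i] -- estimated variance of move i (assumed positive)
    eps[i]    -- small additive term for move i (form not given in the slides)
    """
    # Assumption: the starred quantities belong to the move with the highest mean.
    best = max(range(len(mu)), key=lambda i: mu[i])
    weights = []
    for i in range(len(mu)):
        denom = math.sqrt(2.0 * (sigma2[best] + sigma2[i]))
        weights.append(math.exp(-2.4 * (mu[best] - mu[i]) / denom) + eps[i])
    return weights


def selection_probabilities(mu, sigma2, eps):
    """Normalize the weights so that they sum to 1."""
    weights = selection_weights(mu, sigma2, eps)
    total = sum(weights)
    return [w / total for w in weights]
```

With these weights, a move could be drawn with `random.choices(range(len(mu)), weights=selection_weights(mu, sigma2, eps))`. Note that the best move always gets weight 1 + ε, and a move's weight shrinks as its mean falls further below the best mean relative to the combined uncertainty.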
Backup/Update
Update the statistics at internal nodes based on the result of the simulation. This affects subsequent selection.
Crazy Stone: Uses the “Mix” operator, a linear combination of mean and robust max.
mean: average of child nodes
max: max of child nodes
robust max: child node with the most simulations
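The operators listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Crazy Stone's actual code: the `Child` class and function names are hypothetical, the mean is taken as an average weighted by simulation counts (the slides only say "average"), and the mixing weight `alpha` is a placeholder because the slides give no coefficients.

```python
from dataclasses import dataclass


@dataclass
class Child:
    value: float  # current value estimate of the child node
    visits: int   # number of simulations that went through the child


def mean_backup(children):
    # Assumption: average of child values weighted by simulation counts.
    total_visits = sum(c.visits for c in children)
    return sum(c.value * c.visits for c in children) / total_visits


def max_backup(children):
    # Value of the child with the highest estimate.
    return max(c.value for c in children)


def robust_max_backup(children):
    # Value of the child with the most simulations.
    return max(children, key=lambda c: c.visits).value


def mix_backup(children, alpha=0.5):
    # Linear combination of mean and robust max.
    # alpha is a hypothetical weight; the slides do not state the coefficients.
    return alpha * mean_backup(children) + (1.0 - alpha) * robust_max_backup(children)
```

The example below shows how the operators can disagree: a rarely visited child with a high estimate dominates max, while robust max follows the most explored child.

```python
children = [Child(0.2, 10), Child(0.8, 30), Child(0.9, 5)]
print(max_backup(children))         # value of the child with the highest estimate
print(robust_max_backup(children))  # value of the most-simulated child
print(mix_backup(children))         # blend of mean and robust max
```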