Monte Carlo in LOA
Posted: Thu Dec 30, 2010 2:16 am
Decent academic papers are often hard to find, but this one looks reasonable: Evaluation Function Based Monte-Carlo LOA by Winands and Björnsson. http://www.ru.is/faculty/yngvi/pdf/WinandsB09.pdf
Abstract. Recently, Monte-Carlo Tree Search (MCTS) has advanced the field of computer Go substantially. In the game of Lines of Action (LOA), which has been dominated in the past by alpha-beta [geh, forum doesn't like Greek font], MCTS is making an inroad. In this paper we investigate how to use a positional evaluation function in a Monte-Carlo simulation-based LOA program (MC-LOA). Four different simulation strategies are designed, called Evaluation Cut-Off, Corrective, Greedy, and Mixed, which use an evaluation function in several ways. Experimental results reveal that the Mixed strategy is the best among them. This strategy draws the moves randomly based on their transition probabilities in the first part of a simulation, but selects them based on their evaluation score in the second part of a simulation. Using this simulation strategy the MC-LOA program plays at the same level as the alpha-beta program MIA, the best LOA playing entity in the world.
If you don't know the history, Björnsson created a very good LOA engine more than 5 years ago when at Alberta (beating up on the poor students who had to write one for a grad-student class), and then Winands subsequently dethroned him via his thesis work. This paper "is an important milestone for MCTS, because up until now the traditional game-tree search approach has generally been considered to be better suited for LOA, which features both a moderate branching factor and good state evaluators (the best LOA programs use highly sophisticated evaluation functions)."
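The Mixed strategy from the abstract is simple to sketch: draw moves at random, weighted by transition probability, for the first few plies of a playout, then switch to picking the move with the best evaluation score until the game ends. Here's a rough Python sketch of that idea on a toy game — everything here (the toy state, the evaluation, the uniform transition probabilities, the `switch_ply` cutoff) is my own stand-in, not the paper's actual MC-LOA code:

```python
import random

# Toy stand-in for a game: a counter that "ends" at 0. In MC-LOA these
# would be real LOA positions, moves, a positional evaluation function,
# and learned transition probabilities -- all hypothetical here.

def legal_moves(state):
    return [m for m in (1, 2, 3) if m <= state]

def apply_move(state, move):
    return state - move

def is_terminal(state):
    return state == 0

def evaluate(state, move):
    # Hypothetical evaluation: prefer moves that reduce the count most.
    return -apply_move(state, move)

def transition_prob(state, move):
    # Hypothetical transition probabilities; uniform for the toy game.
    return 1.0

def mixed_simulation(state, switch_ply=3, rng=random):
    """Mixed strategy: weighted-random moves for the first switch_ply
    plies, then greedy selection by evaluation score until terminal."""
    ply = 0
    while not is_terminal(state):
        moves = legal_moves(state)
        if ply < switch_ply:
            # First phase: draw randomly by transition probability.
            weights = [transition_prob(state, m) for m in moves]
            move = rng.choices(moves, weights=weights)[0]
        else:
            # Second phase: greedy by evaluation score.
            move = max(moves, key=lambda m: evaluate(state, m))
        state = apply_move(state, move)
        ply += 1
    return state, ply
```

With `switch_ply=0` this degenerates into the paper's Greedy strategy, and with a very large `switch_ply` into a plain random playout, which is a nice way to see why Mixed interpolates between the two.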