Monte Carlo in LOA

Posted: Thu Dec 30, 2010 2:16 am
by BB+
Decent academic papers are often hard to find, but this one looks reasonable: Evaluation Function Based Monte-Carlo LOA by Winands and Björnsson. http://www.ru.is/faculty/yngvi/pdf/WinandsB09.pdf

Abstract. Recently, Monte-Carlo Tree Search (MCTS) has advanced the field of computer Go substantially. In the game of Lines of Action (LOA), which has been dominated in the past by alpha-beta [geh, forum doesn't like Greek font], MCTS is making an inroad. In this paper we investigate how to use a positional evaluation function in a Monte-Carlo simulation-based LOA program (MC-LOA). Four different simulation strategies are designed, called Evaluation Cut-Off, Corrective, Greedy, and Mixed, which use an evaluation function in several ways. Experimental results reveal that the Mixed strategy is the best among them. This strategy draws the moves randomly based on their transition probabilities in the first part of a simulation, but selects them based on their evaluation score in the second part of a simulation. Using this simulation strategy the MC-LOA program plays at the same level as the alpha-beta program MIA, the best LOA playing entity in the world.
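For concreteness, here is a rough sketch of the "Mixed" strategy as the abstract describes it: sample moves by transition probability early in a playout, then switch to picking the best move by static evaluation. The switch-over ply and all function names below are my own illustrative assumptions, not the paper's code.

```python
import random

SWITCH_PLY = 6  # assumed point where move selection turns greedy

def mixed_playout(state, legal_moves, transition_prob, evaluate,
                  apply_move, is_terminal):
    """Run one simulation from `state` and return the terminal state."""
    ply = 0
    while not is_terminal(state):
        moves = legal_moves(state)
        if ply < SWITCH_PLY:
            # first part: draw randomly, weighted by transition probabilities
            weights = [transition_prob(state, m) for m in moves]
            state = apply_move(state, random.choices(moves, weights=weights)[0])
        else:
            # second part: play the move with the best evaluation score
            best = max(moves, key=lambda m: evaluate(apply_move(state, m)))
            state = apply_move(state, best)
        ply += 1
    return state
```

Plugging in a real LOA move generator and evaluator is left to the reader; the point is just the two-phase structure.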

If you don't know the history, Björnsson created a very good LOA engine > 5 years ago when at Alberta (beating up on the poor students who had to write such for a grad-student class), and then Winands subsequently dethroned him via his thesis work. This paper "is an important milestone for MCTS, because up until now the traditional game-tree search approach has generally been considered to be better suited for LOA, which features both a moderate branching factor and good state evaluators (the best LOA programs use highly sophisticated evaluation functions)."

Re: Monte Carlo in LOA

Posted: Thu Dec 30, 2010 2:45 am
by BB+
3.3 Parallelization
The parallel version of our MC-LOA program uses the so-called “single-run” parallelization [6], also called root parallelization [8]. It consists of building multiple MCTS trees in parallel, with one thread per tree. These threads do not share information with each other. When the available time is up, all the root children of the separate MCTS trees are merged with their corresponding clones. For each group of clones, the scores of all games played are added. Based on this grand total, the best move is selected. This parallelization method only requires a minimal amount of communication between threads, so the parallelization is easy, even on a cluster. For a small number of threads, root parallelization performs remarkably well in comparison to other parallelization methods [6, 8].
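The merging step is simple enough to sketch. Below, `search_tree` stands in for one complete independent MCTS run that returns accumulated scores per root move (a placeholder I've assumed, not the paper's interface); note that in CPython, threads won't actually speed up a CPU-bound search because of the GIL, so this only illustrates the merge logic.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def root_parallel(root_moves, search_tree, n_threads=4):
    """Run n_threads independent searches, merge root scores, pick the best move."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # each call builds one tree with no shared information
        results = list(pool.map(lambda _: search_tree(root_moves),
                                range(n_threads)))
    totals = Counter()
    for scores in results:
        totals.update(scores)           # merge the clones: add up the scores
    return max(totals, key=totals.get)  # best move by the grand total
```

The only communication is the final merge, which is why this scheme ports so easily to a cluster.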

Re: Monte Carlo in LOA

Posted: Thu Dec 30, 2010 5:02 am
by BB+
I think root parallelization did OK up to 16 threads in Go in the above blurbs. A later paper, "Evaluating Root Parallelization in Go" (Soejima et al.), suggests that at 64 threads it shows at least the beginnings of some problems. Rocki and Suda have a paper on "Massively Parallel Monte Carlo Tree Search", and by this they mean 1024 cores (64 nodes, each with 8 dual-core processors). They also mention GPUs in passing (as "future work", though a figure shows the "theoretical maximum" compared to an i7).

Re: Monte Carlo in LOA

Posted: Thu Dec 30, 2010 12:26 pm
by Robert Houdart
Interesting stuff, thanks!

Re: Monte Carlo in LOA

Posted: Fri Dec 31, 2010 3:09 am
by BB+
One annoyance I have with the LOA paper is that it seems almost out-of-date [at least for some purposes] even when it was written (published May 2009). Is MIA (the world's best LOA engine) really not even SMP? Somewhat strangely, they then guess that it would only gain 50% efficiency when passing to 2 threads (so they gave it 50% more time, to try to renormalise). I guess back in 2005-6 when Winands wrapped up his thesis, one CPU was sufficient to dethrone Mona/YL, and SMP was in its infancy. In any case, in the Monte Carlo search paper they only parallelise up to 4 threads, while the Go researchers are way ahead in pushing the envelope (though perhaps even 1024 cores is not that "large" by academic standards).

There are also a few tendentious comments, like: "It is beyond the scope of paper to investigate in how far MCTS scales better than alpha-beta. To give an indication, experiments revealed that for 1 second per move MC-LOA won 42% of the games, whereas for 5 seconds per move MC-LOA already won 46% of the games." Unfortunately, with only 1000 games played in each sample, this is not really all that significant an indication.
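A quick back-of-the-envelope check bears this out: with 1000 games per sample, the 42% vs 46% gap is under two standard errors, so it's borderline at best (standard two-proportion arithmetic; the game counts are from the quote).

```python
import math

n = 1000                # games per sample, as stated in the paper
p1, p2 = 0.42, 0.46     # win rates at 1s and 5s per move

# standard error of the difference between the two win rates
se = math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
z = (p2 - p1) / se      # comes out to roughly 1.8 standard errors
```

At roughly 1.8 standard errors, the gap falls short of the usual 1.96 threshold for 95% confidence.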