Rebel wrote: Hi Bob, long time no see, nice to meet you here. I hope this forum will be a fresh new start for talking in peace about computer chess in all its aspects.
hyatt wrote: (2) If you look at the Crafty source, I have a "phase" variable that tells me which phase of move selection I am in, from "HASH_MOVE" to "CAPTURE_MOVES" to "KILLER_MOVES" to "REMAINING_MOVES". I do not reduce moves until I get to REMAINING_MOVES (effectively the L (late) in LMR). So for me, there are no extra SEE calls. I have already used SEE to choose which captures are searched in CAPTURE_MOVES, leaving the rest for REMAINING_MOVES. I therefore reduce anything in REMAINING_MOVES (except for moves that give check). So there is really no extra SEE usage at all.
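The phase-based gate Bob describes can be sketched roughly as follows. This is an illustrative reconstruction, not Crafty's actual code; the names `Phase`, `should_reduce`, and `move_gives_check` are my own.

```python
# Sketch of phase-based move selection with LMR applied only in the last
# phase. SEE has already been used to split good captures (searched in
# CAPTURE_MOVES) from bad ones (deferred to REMAINING_MOVES), so the
# reduction decision itself needs no extra SEE calls.
from enum import Enum, auto

class Phase(Enum):
    HASH_MOVE = auto()
    CAPTURE_MOVES = auto()    # SEE-good captures
    KILLER_MOVES = auto()
    REMAINING_MOVES = auto()  # everything else, including SEE-bad captures

def should_reduce(phase, move_gives_check):
    """LMR gate: reduce only 'late' moves that do not give check."""
    if phase is not Phase.REMAINING_MOVES:
        return False   # hash move, good captures, killers: full depth
    if move_gives_check:
        return False   # checking moves are never reduced
    return True
```

The point of the design is that the ordering work (SEE on captures) is paid once, and the LMR decision becomes a cheap phase check.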
I stopped developing mine some years ago; if memory serves me well, my exclusion list is as follows:
1) No LMR in the last 3 plies of the search, because of the use of futility pruning;
2) Always search at least 3 moves;
3) Hash move;
4) Captures (I guess I have to try your idea of skipping bad captures);
5) Queen promotions (no minors);
6) Extended moves;
7) Moves that give check;
8) Moves that escape from check;
9) Static mate threats;
10) Killer moves.
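The exclusion list above can be collapsed into a single predicate. This is an illustrative sketch; every field name in `MoveCtx` is hypothetical, and the thresholds (3 plies, 3 moves) are taken directly from the list.

```python
# One-predicate version of the 10-rule LMR exclusion list above.
# A move may be reduced only if none of the exclusions apply.
from dataclasses import dataclass

@dataclass
class MoveCtx:
    depth_left: int          # remaining depth in plies
    move_number: int         # 1-based index in the move list
    is_hash_move: bool
    is_capture: bool
    is_queen_promotion: bool
    is_extended: bool
    gives_check: bool
    escapes_check: bool
    mate_threat: bool        # static mate threat detected
    is_killer: bool

def lmr_allowed(m: MoveCtx) -> bool:
    if m.depth_left <= 3:    # 1) near the leaves, futility pruning rules
        return False
    if m.move_number <= 3:   # 2) first 3 moves always searched fully
        return False
    return not (m.is_hash_move          # 3)
                or m.is_capture         # 4)
                or m.is_queen_promotion # 5)
                or m.is_extended        # 6)
                or m.gives_check        # 7)
                or m.escapes_check      # 8)
                or m.mate_threat        # 9)
                or m.is_killer)         # 10)
```

A quiet, unexceptional move late in the list at healthy depth is reduced; anything on the list is searched at full depth.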
Now lend me your cluster, as I have understood the key to making progress is to play at least 30,000+ eng-eng games. Perhaps you can elaborate a bit on the latter; I am a bit out-of-date these last years, but I still find the latest developments fascinating.
Regards,
Ed
Hi Ed...
my basic approach is to play a varied set of positions, two games per position, against a set of opponents that are reliable on our cluster (one cluster has 128 nodes with 2 cpus each; the other has 70 nodes with 8 cpus each). 30K games gives me an error bar of +/- 3 Elo using BayesElo on the complete PGN. Many changes we make are just a few Elo up or down. Some are 10-20, but those are rarer and could be detected with fewer games. Doing this, Crafty's actual Elo has gone up by almost 300 points in 2 years, where we were lucky to get 40 in previous years, because it is so easy to make a change that sounds good in theory, and which looks good in particular positions, but which hurts overall.
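The "+/- 3 Elo at 30K games" figure can be checked with a back-of-the-envelope calculation. The draw rate and even-match assumptions below are mine, not from the post; the Elo curve E(p) = 400*log10(p/(1-p)) and the binomial standard error are standard.

```python
# Rough 95% confidence interval on an Elo estimate from n games,
# assuming (my numbers) a ~40% draw rate and a near-even match.
import math

def elo_error_bar(games, draw_rate=0.40, score=0.50, z=1.96):
    w = score - draw_rate / 2             # implied win rate
    var = w + draw_rate / 4 - score**2    # variance of one game's score
    sd_score = math.sqrt(var / games)     # std error of the mean score
    # slope of E(p) = 400*log10(p/(1-p)) at p = score
    slope = 400 / math.log(10) / (score * (1 - score))
    return z * slope * sd_score

print(round(elo_error_bar(30_000), 1))  # -> 3.0
```

With those assumptions the 95% interval at 30,000 games comes out to about +/- 3 Elo, matching the figure quoted; note that draws shrink the per-game variance, which is why the error bar is tighter than a pure win/loss model would give.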
Using the 256-cpu cluster, playing fast games of 10 seconds on the clock + 0.1 second increment, we can complete an entire 30,000-game match in about an hour, which means we have a very accurate answer about whether a change was good or bad. And while not quite real-time, we can be working on the next change while testing the last one, so it is pretty efficient. We occasionally play longer games; I have done 60+60 once (60 minutes on the clock, 60 seconds increment), which took almost 5 weeks to complete. Fortunately, testing has shown that almost all changes can be measured equally well at fast or slow time controls; only a few react differently given more or less time.
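The throughput arithmetic is worth spelling out. The game length and scheduling assumptions below are mine (one game per 2 cpus, ~60 moves per side with the clock fully used), not Bob's numbers.

```python
# Worst-case wall-clock estimate for a 30,000-game match at 10s + 0.1s
# on a 256-cpu cluster, assuming one cpu per engine (2 per game).
base, inc, moves = 10.0, 0.1, 60
game_seconds = 2 * (base + inc * moves)   # ~32s upper bound per game
concurrent = 256 // 2                     # 128 games running at once
total_games = 30_000
hours = total_games / concurrent * game_seconds / 3600
print(round(hours, 1))  # -> 2.1
```

The worst case lands near two hours; real games rarely exhaust their clocks (resignations, early adjudication), which is consistent with the match finishing in "about an hour".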
My starting set of positions was chosen by taking a high-quality PGN game collection, going through each game one at a time, and writing out the FEN when it is White's turn to move at move number 12, one position per game. These were then sorted by popularity to get rid of duplicates, and the first 5,000 or so were kept. We are currently using 3,000 positions, where each position is played twice with colors alternated (two games per position), and we use opponents including Stockfish, Fruit, Toga, etc.
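The "sort by popularity, drop duplicates" step reduces to a frequency count. A minimal sketch, assuming the PGN-parsing step has already produced one FEN string per game (that step is omitted here; the function name is my own):

```python
# Keep the `keep` most popular positions, one copy each, from the list
# of FENs written out at White's 12th move (one FEN per source game).
from collections import Counter

def select_positions(fens, keep=5000):
    counts = Counter(fens)                      # popularity = frequency
    return [fen for fen, _ in counts.most_common(keep)]

sample = ["fenA"] * 3 + ["fenB"] * 2 + ["fenC"]
print(select_positions(sample, keep=2))  # -> ['fenA', 'fenB']
```

Sorting by frequency both deduplicates and biases the set toward mainstream openings, which keeps the test positions representative of real games.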
I just make a change, do a profile-guided compile, run the test, and look at the results an hour or so later.