I'm pretty happy that my program has no serious bugs in it now: it doesn't crash and it's managed a few draws against gnuchess. I also think I have all the logic of interfacing with a gui sorted in my head -- I have set it going online through xboard a few times. So I've been thinking about tuning.
Over the last few days, I've started trying to implement this kind of algorithm: https://chessprogramming.wikispaces.com ... ing+Method.
I have added some code to my engine, activated by a compilation flag, so that I can give it a FEN string and it will return the score from the quiescence function. To calculate the evaluation error, I have downloaded a load of games in PGN format and extracted the FEN positions from them along with the game results, so I have a few million positions I can use.
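The result-labeling half of that extraction needs nothing beyond the standard library (actually replaying the moves to get a FEN per position is best left to a chess library). A minimal sketch of the labeling step, with function names of my own invention:

```python
import re

# Map PGN result strings to the scores used by the tuning error function
# (1.0 = White win, 0.5 = draw, 0.0 = Black win).
RESULT_SCORE = {"1-0": 1.0, "1/2-1/2": 0.5, "0-1": 0.0}

def game_results(pgn_text):
    """Extract one result score per game from raw PGN text."""
    results = []
    for tag in re.findall(r'\[Result "([^"]+)"\]', pgn_text):
        if tag in RESULT_SCORE:
            results.append(RESULT_SCORE[tag])
    return results

pgn = ('[Event "A"]\n[Result "1-0"]\n\n1. e4 e5 1-0\n\n'
       '[Event "B"]\n[Result "1/2-1/2"]\n\n1. d4 d5 1/2-1/2\n')
print(game_results(pgn))  # → [1.0, 0.5]
```

Every FEN extracted from a game then inherits that game's score as its training label.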
I have written a Python script that calls my engine as a subprocess, sends the FEN strings to it, and then calculates the evaluation error as per the link. My plan is to refactor my engine so that all the evaluation parameters live in a single file (I already did most of this in anticipation), #include that file, then use the optimization procedures in SciPy to minimize the error, automatically rewriting the file and recompiling for each iteration of the optimization procedure.
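The error being minimized, as I understand the linked method, is the mean squared difference between each game's result and a sigmoid of the quiescence score. A sketch (the constant k is the method's scaling parameter, which would itself be fitted to the data first):

```python
import math

def sigmoid(score_cp, k=1.0):
    """Map a centipawn score to an expected game result in [0, 1].
    k is the scaling constant from the Texel tuning write-up."""
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))

def evaluation_error(positions, k=1.0):
    """Mean squared error between game results and sigmoid(qsearch score).
    positions is a list of (qsearch_score_in_centipawns, result) pairs,
    where result is 1.0, 0.5 or 0.0."""
    total = sum((result - sigmoid(score, k)) ** 2
                for score, result in positions)
    return total / len(positions)

# Tiny example: a +300 cp score in a game White went on to win
# contributes little error; the same score in a lost game, a lot.
print(evaluation_error([(300, 1.0)]))
print(evaluation_error([(300, 0.0)]))
```

This is the function the optimizer has to call once per candidate parameter vector, which is why the cost of scoring millions of positions dominates everything else.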
My problem is that, unless I have misunderstood, each call to the error function requires a *lot* of positions -- the article mentions 8.8 million. I don't have exact timings yet, but evaluating the positions is far too slow, and it doesn't scale linearly with the number of positions: roughly 15 seconds for 10^4 positions in quiescence, but 10 minutes for 10^5. I got sick of waiting for 10^6 positions to complete and went to the pub; it had finished by the time I got back. I would have expected roughly linear scaling. This is on a single CPU core at present.
If one call to the error function takes that long, the optimization procedure will take forever, especially with hundreds of parameters to optimize.
I'm also not certain how I'm going to optimize my piece-square tables: each table has 64 parameters, and I have separate tables for the opening and the endgame, so across the six piece types that's 768 parameters to start with, which must be optimized simultaneously with all the others, unless I implement some kind of analytic function based on fewer parameters.
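With that many discrete parameters, the article's own suggestion of a gradient-free local search may be more practical than SciPy's general-purpose optimizers. A toy sketch of that local search (the real error function over millions of positions is stood in for by a trivial one here):

```python
def local_search(params, error_fn, step=1):
    """One round of the local search from the Texel tuning write-up:
    nudge each parameter by +/-step and keep the change only if the
    error goes down; repeat passes until no parameter improves."""
    improved = True
    while improved:
        improved = False
        for i in range(len(params)):
            best = error_fn(params)
            for delta in (step, -step):
                params[i] += delta
                e = error_fn(params)
                if e < best:
                    best = e
                    improved = True
                else:
                    params[i] -= delta  # revert the nudge
    return params

# Toy stand-in error: squared distance from a known optimum.
target = [3, -2, 7]
err = lambda p: sum((a - b) ** 2 for a, b in zip(p, target))
print(local_search([0, 0, 0], err))  # → [3, -2, 7]
```

It is slow per pass, but each step only ever moves one integer parameter at a time, which suits centipawn-valued tables and never needs gradients.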
I've read of orthogonality in relation to chess evaluation: none of it is orthogonal!
Tuning
- Posts: 190
- Joined: Sun Jul 14, 2013 10:00 am
- Real Name: H.G. Muller
Re: Tuning
It is very suspicious that your engine slows down with the number of positions it has already searched; that suggests something is horribly wrong. Do you also see that effect if you let it go through the same set of 10^4 positions repeatedly?
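A quick way to run that test, assuming the engine reads FEN strings on stdin (a trivial stand-in engine is used here for illustration):

```python
import subprocess
import sys
import time

# Dummy stand-in engine: reads FEN lines, prints one fake score per line.
DUMMY_ENGINE = [sys.executable, "-u", "-c",
                "import sys\nfor line in sys.stdin: print(len(line))"]

def time_batches(engine_cmd, fens, repeats=3):
    """Send the SAME batch of positions through the engine several times
    and report the wall time of each pass. If later passes get slower,
    the slowdown is inside the engine (e.g. some ever-growing table),
    not in the data."""
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        subprocess.run(engine_cmd, input="\n".join(fens) + "\n",
                       capture_output=True, text=True)
        times.append(time.perf_counter() - t0)
    return times

fens = ["rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"] * 1000
for i, t in enumerate(time_batches(DUMMY_ENGINE, fens), 1):
    print(f"pass {i}: {t:.3f}s")
```

Roughly constant per-pass times would point at the data or the I/O; growing times would point at state accumulating inside the engine between positions.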
- Posts: 1242
- Joined: Thu Jun 10, 2010 2:13 am
- Real Name: Bob Hyatt (Robert M. Hyatt)
- Location: University of Alabama at Birmingham
- Contact:
Re: Tuning
ppyvabw wrote:My problem is that, unless I have misunderstood, each call to the error function requires a *lot* of positions -- in that article, it mentions 8.8 million. [...] So it takes roughly 15 seconds to evaluate 10^4 positions in quiescence, but 10 minutes to evaluate 10^5. [...] If it takes that long for one call to the error function, the optimization procedure will take, like forever, especially trying to optimize hundreds of parameters.
Are you sure you understand what the training is all about? It is about feeding in 8.8M DIFFERENT FEN strings and doing a very short (shallow q-search) search on each one, not doing huge searches on a single FEN position. A qsearch on a single position should take microseconds at most; 8.8M positions at a microsecond each is about 9 seconds...
Re: Tuning
In Rookie I communicate between the tuner and the engine through stdin/stdout, which does give overhead. I do ("did" -- this was all in 2010) 2-ply searches to make the best of the communication overhead. I also ran it in batch mode [generateInput --> engine --> processOutput] rather than in back-and-forth mode [tuner <===> engine], so that at least there is some I/O buffering to mitigate the pain.
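In Python terms the batch shape looks roughly like this (with a trivial stand-in for the engine binary; a real tuner would substitute its own engine command):

```python
import subprocess
import sys

# Stand-in for the engine binary: emits one score per input line.
ENGINE = [sys.executable, "-c",
          "import sys\nfor line in sys.stdin: print(100)"]

def batch_evaluate(fens, engine_cmd=ENGINE):
    """Batch mode: generate all input up front, run the engine once
    over the whole batch, then process all output. One process launch
    and fully buffered pipes, instead of a lock-step request/response
    exchange per position."""
    out = subprocess.run(engine_cmd, input="\n".join(fens) + "\n",
                         capture_output=True, text=True, check=True)
    return [int(line) for line in out.stdout.split()]

scores = batch_evaluate(["fen1", "fen2", "fen3"])
print(scores)  # → [100, 100, 100]
```

The i-th output line pairs with the i-th input position, so the tuner never has to wait on the engine between positions.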
In Floyd I can compile the engine core into a Python module which I can import into the tuner. With that I can comfortably do qSearches for tuning because all is running in the same address space.
The second program is open source (and comes with the tuner, which works fine but is not very advanced). https://github.com/kervinck/floyd