Copy Board vs Unmake Move
Copy Board vs Unmake Move
Apologies if this is a little basic for this forum: I am fairy new to this and am trying to get my head around why most chess engines unmake moves, rather than using memory copy during the search. I understand (vagely) that it is broadly to do with memory access latencies, but with more modern processors with large caches sizes is this still such a big issue? Can anyone point me to any discussions/resources on this point? (I couldn't seem to find much on simplistic search criteria) Thanks.
- Bo Persson
- Posts: 14
- Joined: Thu Jun 10, 2010 10:34 am
- Real Name: Bo Persson
- Location: Malmö, Sweden
Re: Copy Board vs Unmake Move
Which is most efficient depends a lot on how large a state your program keeps. You can either use copy-and-discard, or make-unmake. The amount of memory traffic produced by each method, determines what will be the fastest.
A make-unmake might keep the data in the same cache lines for a longer time, while not having an unmake function obviously saves some execution time. "The fastest way of doing something, is not doing it at all."
Like almost always, there is no general answer.
A make-unmake might keep the data in the same cache lines for a longer time, while not having an unmake function obviously saves some execution time. "The fastest way of doing something, is not doing it at all."
Like almost always, there is no general answer.
-
- Posts: 1242
- Joined: Thu Jun 10, 2010 2:13 am
- Real Name: Bob Hyatt (Robert M. Hyatt)
- Location: University of Alabama at Birmingham
- Contact:
Re: Copy Board vs Unmake Move
ChrisJ wrote:Apologies if this is a little basic for this forum: I am fairy new to this and am trying to get my head around why most chess engines unmake moves, rather than using memory copy during the search. I understand (vagely) that it is broadly to do with memory access latencies, but with more modern processors with large caches sizes is this still such a big issue? Can anyone point me to any discussions/resources on this point? (I couldn't seem to find much on simplistic search criteria) Thanks.
Measure the size of your "state". board. any pointers you need. hash stuff. Etc. Look at what MakeMove() modifies. Once you know that number, you have the minimum amount of memory needed in L1/L2/L3 cache to hold that. Then multiply that by 64 (or whatever ply limit you impose on your search to limit max reachable depth. For copy/make, you need that much memory, which will likely stress L1 pretty heavily. That tends to generate lots of memory traffic (remember, you copy one complete state for every move you search, in the case of crafty on a decent 8-core box that is 30M copy/makes per second. a ton of memory bandwidth required.)
Make/Unmake executes more code, but it is the _same_ code for each ply, so you are not shuffling stuff in and out of cache regularly. More work, but less memory traffic is almost always a win since memory is so slow compared to the speed of the CPU. On 64 bit cpus, with the 4-level virtual-to-real memory addressing hardware, this can turn into a nightmare to access a lot of memory.
Re: Copy Board vs Unmake Move
Thanks for these replies - I must become more familiar with memory caching vs processing speeds- that said, I am not close to finalising my board state reqirements: to include incremental evaluation criteria, facilitate further optimisation of move generation code etc - so don't know how big its going to end up. However, it seems pretty straigtforward to replace the make/unmake method with a copy/make/discard method to see (and measure) the impact as the engine development progresses. It sounds, though, that its likely to be a clear win to define arrays for and apply copy/make/discard to the 'irreversible' state data (ep sq, half-move count, castle flags) - along with the 64bit Z Hash Key...
Re: Copy Board vs Unmake Move
My engine uses CopyBoard() because of laziness. In a new engine it's much easier to copy (essentially one line of code) than to write a complex and bug-prone Unmake().
Careful profiling shows that time is spent:
15.7% MakeMove
5.5% CopyBoard
77.9% MoveGen, Evaluate, Search,...
I doubt that I am able to write an Unmake() faster than CopyBoard(); it would have to be 3x faster than MakeMove() just to break even. Thus laziness, incompetence and performance all seem to be saying that for my engine "copy is best". YMMV.
For reference, this is 64-bit on Core i5. Board struct is 728 bytes. Speed is approximately 2.5 Mnps single threaded.
Careful profiling shows that time is spent:
15.7% MakeMove
5.5% CopyBoard
77.9% MoveGen, Evaluate, Search,...
I doubt that I am able to write an Unmake() faster than CopyBoard(); it would have to be 3x faster than MakeMove() just to break even. Thus laziness, incompetence and performance all seem to be saying that for my engine "copy is best". YMMV.
For reference, this is 64-bit on Core i5. Board struct is 728 bytes. Speed is approximately 2.5 Mnps single threaded.
-
- Posts: 1242
- Joined: Thu Jun 10, 2010 2:13 am
- Real Name: Bob Hyatt (Robert M. Hyatt)
- Location: University of Alabama at Birmingham
- Contact:
Re: Copy Board vs Unmake Move
Note that you are not just fighting for the 5.5% in CopyBoard(), but you will also speed up MakeMove() a good bit, as well as the reset of the program that runs into cache conflicts because of that copy operation. In Crafty, Make/Unmake is < 10% of total time...RobertP wrote:My engine uses CopyBoard() because of laziness. In a new engine it's much easier to copy (essentially one line of code) than to write a complex and bug-prone Unmake().
Careful profiling shows that time is spent:
15.7% MakeMove
5.5% CopyBoard
77.9% MoveGen, Evaluate, Search,...
I doubt that I am able to write an Unmake() faster than CopyBoard(); it would have to be 3x faster than MakeMove() just to break even. Thus laziness, incompetence and performance all seem to be saying that for my engine "copy is best". YMMV.
For reference, this is 64-bit on Core i5. Board struct is 728 bytes. Speed is approximately 2.5 Mnps single threaded.
Re: Copy Board vs Unmake Move
Until now, my Board struct has contained an 'attack board'
BitBoard attacksFrom[64];
as a distant legacy from Slate & Atkin's Chess 4.5.
Caching attacks in this way not only inflates Board size, it requires elaborate code for incremental updating in MakeMove().
Stimulated by this thread, I eliminated attacksFrom[64], thereby simplifying MakeMove() and reducing the Board struct to 216 bytes. Attacks from a given square (when required by MoveGen or Evaluate) are now generated on the fly; this seems to be normal practice nowadays.
I get a small but useful 8% speedup, part of which is no doubt due to a reduction in cache traffic. CopyBoard() does one third of the work it used to do.
The profile looks like this:
9.2% MakeMove
3.7% CopyBoard
86.5% MoveGen, Evaluate, Search,...
BitBoard attacksFrom[64];
as a distant legacy from Slate & Atkin's Chess 4.5.
Caching attacks in this way not only inflates Board size, it requires elaborate code for incremental updating in MakeMove().
Stimulated by this thread, I eliminated attacksFrom[64], thereby simplifying MakeMove() and reducing the Board struct to 216 bytes. Attacks from a given square (when required by MoveGen or Evaluate) are now generated on the fly; this seems to be normal practice nowadays.
I get a small but useful 8% speedup, part of which is no doubt due to a reduction in cache traffic. CopyBoard() does one third of the work it used to do.
The profile looks like this:
9.2% MakeMove
3.7% CopyBoard
86.5% MoveGen, Evaluate, Search,...
-
- Posts: 144
- Joined: Sun Jun 13, 2010 7:32 am
- Contact:
Re: Copy Board vs Unmake Move
ChrisJ wrote:unmake moves, rather than using memory copy during the search
Actually there is a third way of performing unmake(). Just allocate the necessary memory at once. Smth likeRobertP wrote:Until now, my Board struct has contained an 'attack board'
BitBoard attacksFrom[64];
as a distant legacy from Slate & Atkin's Chess 4.5.
Caching attacks in this way not only inflates Board size, it requires elaborate code for incremental updating in MakeMove().
Code: Select all
typedef struct {
BitBoard attacksFrom[64];
} PositionInfo;
#define MaxPly 128
PositionInfo Info[MaxPly];
PositionInfo * CurrentInfo = Info;
void make() {
CurrentInfo++;
* * *
// copy/update bishop/rook/queen attacks:
* * *
}
void unmake() {
* * *
CurrentInfo--;
}
-
- Posts: 1242
- Joined: Thu Jun 10, 2010 2:13 am
- Real Name: Bob Hyatt (Robert M. Hyatt)
- Location: University of Alabama at Birmingham
- Contact:
Re: Copy Board vs Unmake Move
ThinkingALot wrote:ChrisJ wrote:unmake moves, rather than using memory copy during the searchActually there is a third way of performing unmake(). Just allocate the necessary memory at once. Smth likeRobertP wrote:Until now, my Board struct has contained an 'attack board'
BitBoard attacksFrom[64];
as a distant legacy from Slate & Atkin's Chess 4.5.
Caching attacks in this way not only inflates Board size, it requires elaborate code for incremental updating in MakeMove().Code: Select all
typedef struct { BitBoard attacksFrom[64]; } PositionInfo; #define MaxPly 128 PositionInfo Info[MaxPly]; PositionInfo * CurrentInfo = Info; void make() { CurrentInfo++; * * * // copy/update bishop/rook/queen attacks: * * * } void unmake() { * * * CurrentInfo--; }
Same exact issue, however...