Page 1 of 2

Lots of bugs

Posted: Thu Feb 04, 2016 1:58 am
by ppyvabw
Sorry for what might all sound like a blatant defeatist moan, but I don't mean for it to be -- rather just looking for tips from more experienced developers who may have had similar problems when they began. I have just got my engine working with xboard, and being able to play against other engines without having to babysit has brought to light a number of problems, which I have only partially resolved.

It did bring to light a problem with my futility pruning. It was running faster without it, but I managed to solve that particular issue, although my poor old brain can't get around why it would slow it down. I'm happy that's ok now however.

Secondly, when Eva is white, and the opening goes:
e4, e5
d4, e5xd4
she seems to willingly give up the pawn. She doesn't recapture with the queen, but just attacks it again with the king's knight and spends the next few moves hopelessly trying to win it back. I'm hoping this may just be an evaluation tuning issue, but disabling bits of my evaluation function corrects it, only to find that changing a line of code in my search function makes it do it again. And I found this with my previous attempt at writing an engine (which shares much of the same design as Eva) so it seems just like that is what it naturally wants to do, like it's at some kind of minimum point and any deviation from that point will result in it just falling back into the minimum. But whatever the reason, even as an average chess player, it is wrong!

Also, my PV arbitrarily cuts off. I have partially managed to solve this -- it was much worse with each depth iteration only being 2 or 3 moves long. Now, it can be at depth 9 with a pv 9 moves long say, and at depth 10 there might only be 1 move in the pv. The replacement scheme for my hashtable goes thus (probably easier to just give the code than try to explain it in words):

Code: Select all

    if (TT[g%Nh].stale()) TT[g%Nh]=x;
    else if (x.node()==pv) TT[g%Nh]=x;
    else if (g==TT[g%Nh].key() && TT[g%Nh].depth()<x.depth() && TT[g%Nh].node()!=pv) TT[g%Nh]=x;
At each call to my search function, I check the hashtable, and if it's a pv node I just return the score. If it's a cut node, I return the TT score if it's greater than the current beta and the depth is greater. If it's an all node I return the score if it's lower than the current alpha and the depth is greater. I refresh the stale flag if a node turns out to still be useful. Also, after the first move, my PV (as printed to the screen) is extended. I think that's just because I extract the PV from the TT table, and it's picking up old entries, but I don't understand why.

Finally, and I think this is related to the previous problem, is that as the game goes on, my program slows down massively. I would obviously expect it to as the game gets more complicated, but on the positions that it is particularly slow on (~half an hour), I write them down and set it going from scratch from that position and it takes only 5 minutes say to depth 15. With my extensions set as they are, it gets to a maximum depth of the low 20s, not including quiescence. My theory is that the hash table is getting filled up with entries, and never getting replaced, but I'm not sure why. At present my hashtable's size is 10^8 entries, which takes up like 6 gigabytes, which I am sure is quite unnecessary. Previously I thought it may be due to search explosion, so I really cut down my search extensions to fractions of a ply, and never more than 1 in total, but it's still slowing down. I clear the killer tables and the history tables at the beginning of each search.

Re: Lots of bugs

Posted: Thu Feb 04, 2016 2:06 am
by ppyvabw
ppyvabw wrote:Sorry for what might all sound like a blatant defeatist moan, but I don't mean for it to be -- rather just looking for tips from more experienced developers who may have had similar problems when they began. I have just got my engine working with xboard, and being able to play against other engines without having to babysit has brought to light a number of problems, which I have only partially resolved.

It did bring to light a problem with my futility pruning. It was running faster without it, but I managed to solve that particular issue, although my poor old brain can't get around why the particular error I found with it would slow it down. I'm happy that's ok now however.

Secondly, when Eva is white, and the opening goes:
e4, e5
d4, e5xd4
she seems to willingly give up the pawn. She doesn't recapture with the queen, but just attacks it again with the king's knight and spends the next few moves hopelessly trying to win it back. I'm hoping this may just be an evaluation tuning issue, but disabling bits of my evaluation function corrects it, only to find that changing a line of code in my search function makes it do it again. And I found this with my previous attempt at writing an engine (which shares much of the same design as Eva) so it seems just like that is what it naturally wants to do, like it's at some kind of minimum point and any deviation from that point will result in it just falling back into the minimum. But whatever the reason, even as an average chess player, it is wrong!

Also, my PV arbitrarily cuts off. I have partially managed to solve this -- it was much worse with the pv of each depth iteration only being 2 or 3 moves long. Now, it can be at depth 9 with a pv 9 moves long say, but at depth 10 there might only be 1 move in the pv. The replacement scheme for my hashtable goes thus (probably easier to just give the code than try to explain it in words):

Code: Select all

    if (TT[g%Nh].stale()) TT[g%Nh]=x;
    else if (x.node()==pv) TT[g%Nh]=x;
    else if (g==TT[g%Nh].key() && TT[g%Nh].depth()<x.depth() && TT[g%Nh].node()!=pv) TT[g%Nh]=x;
At each call to my search function, I check the hashtable, and if it's a pv node I just return the score. If it's a cut node, I return the TT score if it's greater than the current beta and the depth is greater. If it's an all node I return the score if it's lower than the current alpha and the depth is greater. I refresh the stale flag if a node turns out to still be useful. Also, after the first move, my PV (as printed to the screen) is extended. I think that's just because I extract the PV from the TT table, and it's picking up old entries, but I don't understand why.

Finally, and I think this is related to the previous problem, is that as the game goes on, my program slows down massively. I would obviously expect it to as the game gets more complicated, but on the positions that it is particularly slow on (~half an hour), I write them down and set it going from scratch from that position and it takes only 5 minutes say to depth 15. With my extensions set as they are, it gets to a maximum depth of the low 20s, not including quiescence. My theory is that the hash table is getting filled up with entries, and never getting replaced, but I'm not sure why. At present my hashtable's size is 10^8 entries, which takes up like 6 gigabytes, which I am sure is quite unnecessary. Previously I thought it may be due to search explosion, so I really cut down my search extensions to fractions of a ply, and never more than 1 in total, but it's still slowing down. I clear the killer tables and the history tables at the beginning of each search.

Re: Lots of bugs

Posted: Thu Feb 04, 2016 8:02 pm
by hyatt
In English:

if (TT[g%Nh].stale()) TT[g%Nh]=x;

if this is from an old search, just replace it on the spot. What, precisely, does stale() test? Does stale mean it came from the last full search from the root, or just from a previous iteration, possibly in this search??

else if (x.node()==pv) TT[g%Nh]=x;

If this is a pv (I assume value > alpha and value < beta) then store here period.

else if (g==TT[g%Nh].key() && TT[g%Nh].depth()<x.depth() && TT[g%Nh].node()!=pv) TT[g%Nh]=x;

This one is probably wrong. If the key matches, and the current depth is > than table depth store. You really MUST store any position once it is searched, no exceptions. In the above case, think about this. You have a table entry with draft = 10 (depth). At the current search (with depth=9) you probe and find this entry, but the bound does not let you terminate the search. So you do the search, and then decide to keep the original entry (greater depth) even though it was useless here. That's certainly a flaw.

Bottom line is you ALWAYS store the current position in the table, no exceptions. If you move to a bucket approach as I use, you have 4 entries and THERE you can use the depth to make the decision, but only from the perspective of "which entry am I going to overwrite", not "am I going to overwrite one of these or just forget about it?" You NEVER "forget about it"..

Re: Lots of bugs

Posted: Fri Feb 05, 2016 8:46 am
by H.G.Muller
ppyvabw wrote:Secondly, when Eva is white, and the opening goes:
e4, e5
d4, e5xd4
she seems to willingly give up the pawn. She doesn't recapture with the queen, but just attacks it again with the king's knight and spends the next few moves hopelessly trying to win it back.
For problems like that it always helps me greatly to be able to trace WHY exactly the engine plays a move that doesn't seem to make any sense. To do that efficiently, I make the engine aware of its current location in the tree by keeping track of the ply level, and by storing every move it makes in an array path[level]. I then put a conditional print statement directly after UnMake(), which prints level, remaining depth and (if applicable) iteration depth (so you can see what node caused the printing), the current move, its score and the best score so far. The condition for printing selects the node I am interested in, starting with the root.

This way I get a list of moves + scores in the node(s) of interest, which then shows which move was wrongly evaluated. (Like not seeing it could gain a Pawn, or wrongly imagining it could still gain it later.) I then follow up that move to print the same list in the daughter node, etc. Either I arrive at the leaves,where the static evaluation makes a mistake, or I encounter a pruning error on the way to it. Or you end in a hash cutoff. So I also put a print statement (subject to the same condition) that prints the hash key when there is a cutoff. If it seems that the hash table contained a wrong score for the position, I then add a print statement at the hash store, which prints the path to the current node when there is a store with that key. I can then identify the node that was responsible for producing that hashed score, and continue from there.

Eventually this will always get you at the problem. And if you made the infra-structure to do this efficient (e.g. use a Match() routine to compare the path to the current node with the path along which you want to print, the latter being given on the command line of your executable) you can actually do it very rapidly. Just run the engine on the position where it makes the error, and based on the output add the suspect move to the command line and run it again, etc.

Re: Lots of bugs

Posted: Sat Feb 06, 2016 2:14 am
by ppyvabw
if (TT[g%Nh].stale()) TT[g%Nh]=x;

if this is from an old search, just replace it on the spot. What, precisely, does stale() test? Does stale mean it came from the last full search from the root, or just from a previous iteration, possibly in this search??
Stale means it came from the last full search from the root. I reset it before each search.
else if (x.node()==pv) TT[g%Nh]=x;

If this is a pv (I assume value > alpha and value < beta) then store here period.
Yes, that is what that means. What if, say, you're searching to depth n, using the pv from depth n-1, but you find a 'better' combination at depth n, which has been searched to a lower depth along branches that have been reduced? Should there be a clause in that if statement that only replaces pv nodes with pv nodes searched to a higher depth? Otherwise at the next iteration, your pv may only be a few moves long, if it's along a branch of the tree that has been heavily reduced?
else if (g==TT[g%Nh].key() && TT[g%Nh].depth()<x.depth() && TT[g%Nh].node()!=pv) TT[g%Nh]=x;

This one is probably wrong. If the key matches, and the current depth is > than table depth store. You really MUST store any position once it is searched, no exceptions. In the above case, think about this. You have a table entry with draft = 10 (depth). At the current search (with depth=9) you probe and find this entry, but the bound does not let you terminate the search. So you do the search, and then decide to keep the original entry (greater depth) even though it was useless here. That's certainly a flaw.
It's late here, so I'll have to think about that one tomorrow.

Thanks for the help; this stuff is probably so basic for you guys. Also, thanks H.G.Muller; I'll have to think about your advice in the morning as well.

Adam

Re: Lots of bugs

Posted: Sat Feb 06, 2016 9:31 am
by H.G.Muller
ppyvabw wrote:Yes, that is what that means. What if, say, you're searching to depth n, using the pv from depth n-1, but you find a 'better' combination at depth n, which has been searched to a lower depth along branches that have been reduced? Should there be a clause in that if statement that only replaces pv nodes with pv nodes searched to a higher depth? Otherwise at the next iteration, your pv may only be a few moves long, if it's along a branch of the tree that has been heavily reduced?
PV nodes have an exact score, which can be used with any window. So the only reason to search a PV node that was already in the table is that depth in the table is lower than the depth that you need now. So the situation you sketch simply cannot occur. You either take a hash cutoff before searching, or you search, and will get a result for a larger depth, which should replace the lower depth in the table.

Re: Lots of bugs

Posted: Tue Feb 09, 2016 2:10 am
by ppyvabw
H.G.Muller wrote:PV nodes have an exact score, which can be used with any window. So the only reason to search a PV node that was already in the table is that depth in the table is lower than the depth that you need now. So the situation you sketch simply cannot occur. You either take a hash cutoff before searching, or you search, and will get a result for a larger depth, which should replace the lower depth in the table.
Hey,

Sorry; I don't understand. Or rather, I understand what you said, but I don't think it answers my confusion because I perhaps didn't explain it very well.

Say you're searching the root to depth n with bounds (-infinity, infinity). The root is always a pv node, so it's an exact score, but every move will be searched. It plays the PV from the previous iteration first to depth n, and the score of the first child node returns to root. Now, it searches the remaining children of the root -- all of them because the root is a pv node. Some of these sibling nodes may be searched to a lower depth because of reductions further up the tree -- i.e late move reductions for moves that never score highly in the history tables. Say one of the sibling nodes increases the score over the current PV -- that move will get replaced in the transposition table as the new pv, even though it has been searched to a lower depth, unless I have some clause in my replacement scheme that only replaces PV nodes if they are searched to greater depth. (I think)

Re: Lots of bugs

Posted: Tue Feb 09, 2016 8:10 pm
by H.G.Muller
ppyvabw wrote: -- that move will get replaced in the transposition table as the new pv, even though it has been searched to a lower depth, ...
Lower depth than what?

For one, it was NOT in the TT as a PV node, but as a cut node. And if the depth for that node stored in the PV was equal or higher to what you need now, it would have been good enough to also refute this alternate move in the root now, unless the score of the PV dropped too much. But since it is from the previous iteration, it is actually likely to have a stored depth that is one lower than what you need now. If the move is reduced now, why wouldn't it have been reduced in the previous iteration? It would be a bit strange if when the root was, say, at depth 9 with a score of +0.50 on the previous iteration, and this move was rejected because the score was <= +0.40 at the reduced depth of 8, (and so that the daughter was stored in the TT as a cut node with score >= -0.40 and depth 7), that you get to depth 10 in the root, where the score of the previous PV move dropped to +0.10, and would now only want to search that move at a depth <= 7 (and the daughter thus at depth 6).

But even if you do, I don't see the problem. It means that you now request the daughter with depth 6 and window {-inf, -0.10}. What you find in the TT has depth 8, but alas, a useless bound of >= -0.40, meaning the score can both be < -0.10 and > -0.10. So you have to search it (fortunately only at depth 6) to figure that out, and you will overwrite the old entry, as a depth=6 entry with a relevant score bound might save you some work in the future, while a depth=8 entry with useless bounds will never save you any work, and just wastes table space.

If the search reveals the daughter is now a PV node, with a score -0.30, meaning that the move in the root leading to it gets score +0.30, which is better than the old PV move, which had +0.10, you will undo the reduction, and re-search the move at depth=10 (as PV moves should never be reduced). So you arrive in the same daughter again, this time with a search request for depth=9, while the TT only hold a depth=6 entry for it. So you have to search again, and the resulting depth=9 entry will overwrite the depth=6 result, whether it is now an exact score <= -0.10, or a lower bound >= -0.10 after all.

Re: Lots of bugs

Posted: Wed Feb 10, 2016 1:42 am
by ppyvabw
Thanks for your reply.
H.G.Muller wrote:Lower depth than what?
Lower depth than the current PV.

It looks like I probably have some big misunderstandings about what is happening during the search, so this evening I have stripped off most of the bells and whistles. I have commented out all move reductions, extensions, pruning, internal iterative deepening and returns from the TT table, to put them back in and debug them one-by-one.

With just my search function to a nominal, but iteratively increasing depth, and TT probes to get the PV and hash moves (I have retained null pruning), it works fine: the PV does what I would expect it to and doesn't reduce. So on that basis, I think I can eliminate TT clashes for the shortening of the PV. So I'm taking this as a minimal working example of my program.

As soon as I put back the returns from the TT table (so literally just un-comment the code below), the PV shortens. So, for example from a test position that I am using from a game against gnuchess where my engine was particularly slow, at depth 8 the PV is only 3 moves long.

At the beginning of each call to my search routine I have this:

Code: Select all

    if (!is_pv_node && TT_temp.depth()>=depth){
        switch(TT_temp.node()){
            case pv:{
                 return TT_temp.score();        
                break;
            }
            case all:{
                if (beta>TT_temp.score()) beta=TT_temp.score();
                break;
            }            
            case cut:{
                if (alpha < TT_temp.score()) alpha=TT_temp.score();
                break;
            }            
        }
        if (alpha>=beta) {return alpha;}
    }
TT_temp is just a temporary copy of the TT entry if there is a TT hit.
So in English, only consider TT entries that are a greater depth than I want now,
If it's a PV node, the score is exact so just return it.
If it's a cut node, the TT score is a lower bound, so we can safely raise alpha if it's lower than the lower bound.
If it's an all node, the TT score is an upper bound, so we can safely lower beta if it's greater than the upper bound.
If after all that, alpha is greater than beta we can fail high.

I don't think that is wrong -- happy to be told if it is -- but I don't see why it would cause the PV to shorten, especially as I don't even do the above on PV nodes.

Cheers,

Adam

Re: Lots of bugs

Posted: Wed Feb 10, 2016 8:20 am
by H.G.Muller
Indeed, something must be badly wrong. If you don't do this in PV nodes, you should never get a shortened PV. Either there must be something wrong with the way you collect the PV, or with the search. Or with your node classification.

But staring at the code is usually not an efficient way to make progress. Just print the info on the TT probe and search window when you enter the node, and the list of moves searched there and their scores, in the node that is at the end of your printed PV. That will show you within minutes why there doesn't seem to be a PV move in that node. Or if there is, with some extra printing, why the PV belonging to that move was not appended to the PV that you backed up to the root.