Endgame recognition and ELO
Hi, what is the Elo gain from using endgame functions such as KPK, KNK, KNNKN etc. (interior node recognition, NO tablebase)?
My tests are failing and gain very little Elo.
-
- Posts: 190
- Joined: Sun Jul 14, 2013 10:00 am
- Real Name: H.G. Muller
Re: Endgame recognition and ELO
I once experimented in this respect with Fairy-Max, having it keep a count for each piece type, and discounting scores for a leading side that had 0 or 1 Pawn by a factor 2, 4 or 8. This 'Pair-o-Max' was surprisingly stronger than the regular Fairy-Max, nearly 100 Elo IIRC. But it was not the only change; I also used the piece counts to award a bonus for the Bishop pair, awarding extra score when the piece count for Bishops went from even to odd. This allowed me to pick better piece values, and solved the problem that Fairy-Max would squander a B-pair advantage (BB vs BN) by trading Bishops (which it could do no matter how you set the B and N value). I suspect a large part of the Elo rise must have been due to that.
It probably depends a lot on which end-games exactly you recognize. Just discounting pawnless end-games probably would not bring a lot, as it just makes the engine avoid the opponent sacrificing a minor for the last Pawn, at the expense of not being able to advance that Pawn, which makes it just as much a draw 50 moves later.
In Pair-o-Max I assumed an end-game without Pawns for the strong side, with <= 2 pieces against <= 1 piece (excl. Kings and defender Pawns), would be a draw when the strong side is not more than 350 cP ahead according to naive material counting. Such 'drawn' end-games would then be discounted by a factor 8. When more than 350 cP ahead under the same conditions, they would be considered 'difficult wins', and these would still be discounted by a factor 2, except when the defender had nothing but King and Pawns, in which case there was no discount at all. If the strong side had exactly one Pawn, it would cancel the opponent's least-valuable piece, and a remaining advantage of less than 350 cP under the same conditions as above would be discounted by a factor 4 instead of 8. So in summary:
K + <= 2 pieces vs K + 1 piece [+ pawns?]:
  drawn: divide by 8
  difficult: divide by 2
K + <= 2 pieces vs K [+ pawns?]:
  drawn: divide by 8
  other: no discount
K + <= 2 pieces + 1 pawn vs K + 2 pieces [+ pawns?]:
  drawn after piece-for-pawn sac: divide by 4
  other: no discount
The 350-cP limit for qualifying as 'drawn' was subject to some exceptions:
When the leading side had...
1) a pair of pieces indicated as 'defective pair' it would be considered drawn irrespective of true material value
2) a single piece that was color bound, it would be considered drawn irrespective of true material value
3) a piece indicated as 'mating minor', it would NOT be considered drawn irrespective of true material value
4) a single piece, and the (remaining) piece of the lagging side was indicated as 'tough defender', it would be considered drawn irrespective of true material value
In orthodox Chess only case (1) is needed, for a pair of Knights. These rules would cause KBNPKNN (which could still involve a substantial advantage if the Pawn is on 7th rank) to be discounted by a factor 4, and KNNPKB and KNNPKR by a factor 4 as well (by virtue of exception 1). The 'mating minor', 'color-bound major' and 'tough defender' cases only occur in Chess variants. (E.g. a non-royal King M is both a mating minor and a tough defender; KMK is won and KQKM is a draw, but M is worth only as much as a Bishop.)
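The discount rules above can be sketched in C++ roughly as follows. This is only an illustration, not Fairy-Max or Pair-o-Max source code: the `Side` struct, the function names, the piece values used in the examples (N = B = 325, R = 500, Q = 950, P = 100) and the choice to include defender Pawns in the material difference are all assumptions made for the example. Only the orthodox-Chess 'defective pair' exception is modeled, and the 350-cP limit is treated as inclusive in both cases.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical material summary for one side (Kings excluded), values in cP.
struct Side {
    std::vector<int> pieces;  // values of the non-Pawn pieces
    int pawns;                // number of Pawns
    bool defectivePair;       // e.g. exactly two Knights in orthodox Chess
};

const int kPawnValue = 100;   // assumed Pawn value
const int kDrawMargin = 350;  // the 350-cP limit from the post

// Naive material count: sum of piece values plus Pawns.
int material(const Side& s) {
    return std::accumulate(s.pieces.begin(), s.pieces.end(), 0) + s.pawns * kPawnValue;
}

// Returns the factor (1, 2, 4 or 8) by which the leading side's score is divided.
int discountDivisor(const Side& strong, Side weak) {
    if (strong.pieces.size() > 2 || strong.pawns > 1) return 1;

    bool pawnSacrificed = false;
    int strongMaterial = material(strong);
    if (strong.pawns == 1) {
        if (weak.pieces.empty()) return 1;  // nothing to give up for the Pawn
        // The last Pawn cancels the defender's least valuable piece.
        weak.pieces.erase(std::min_element(weak.pieces.begin(), weak.pieces.end()));
        strongMaterial -= kPawnValue;
        pawnSacrificed = true;
    }
    if (weak.pieces.size() > 1) return 1;

    int lead = strongMaterial - material(weak);
    bool drawn = strong.defectivePair || lead <= kDrawMargin;

    if (pawnSacrificed) return drawn ? 4 : 1;
    if (drawn) return 8;
    return weak.pieces.empty() ? 1 : 2;  // 'difficult win' only against a piece
}
```

With these assumed values, KBNPKNN and KNNPKR both come out at a divisor of 4, as stated in the post, while KBNK gets no discount.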
-
- Posts: 1242
- Joined: Thu Jun 10, 2010 2:13 am
- Real Name: Bob Hyatt (Robert M. Hyatt)
- Location: University of Alabama at Birmingham
- Contact:
Re: Endgame recognition and ELO
There are a few terribly tricky cases. For example, KRB vs KR is not always drawn. If the losing king is trapped on the edge of the board by the rook, this is a forced win, even with a "color-bound" piece and an advantage of only one piece. I got burned by this case a couple of times before I fixed it and stopped always considering R+minor vs R as a draw. There are other useful cases, such as KRP vs KR, which is almost always drawn when the defending king is in front of the pawn.
-
- Posts: 190
- Joined: Sun Jul 14, 2013 10:00 am
- Real Name: H.G. Muller
Re: Endgame recognition and ELO
It depends a lot on how you use the information. If you use it just to discount the static evaluation, it should be best to go for the most-common case, and in the overwhelming majority of cases KRKB is a draw. I see no reason why assuming a draw when it is actually a win would be more harmful than assuming a win when it is actually a draw, so that only the frequency of occurrence is important. If you evaluated a draw in a won position, deeper search should at some point find the winning line, where you mate or convert to KRK, which would no longer be discounted.
If you prune assumed draws at any depth, however, you have to be more careful, as a wrong assumption would never be corrected.
Also note that there is a difference between end-games that can be classified as drawish just based on the material key without position information, such as KBPKB, (where the chances to advance the P while dodging a B-for-P sacrifice are very slim), and cases which only become drawish with a certain (often rare) positioning of the pieces (such as KBPK and KQKP, or KPK and KRPKR).
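The difference between discounting the static evaluation and pruning on an assumed draw can be illustrated with a toy negamax. This is a sketch under stated assumptions, not code from any of the engines mentioned: the `Node` struct, the `Mode` enum and the divisor 8 are made up for the example, and scores are in cP from the point of view of the side to move.

```cpp
#include <algorithm>
#include <vector>

// Toy game-tree node; scores are in cP from the side to move's point of view.
struct Node {
    int staticCp;                // naive static evaluation
    bool assumedDraw;            // material class is flagged as drawish
    std::vector<Node> children;  // positions reachable in one move
};

enum class Mode { Discount, Prune };

int search(const Node& n, int depth, Mode mode) {
    // Pruning: trust the draw assumption at any depth and stop searching.
    if (mode == Mode::Prune && n.assumedDraw) return 0;
    if (depth == 0 || n.children.empty())
        return n.assumedDraw ? n.staticCp / 8 : n.staticCp;  // discounted leaf score
    int best = -100000;
    for (const Node& child : n.children)
        best = std::max(best, -search(child, depth - 1, mode));
    return best;
}
```

If the root is flagged as drawish but has a move into a position that is no longer flagged (and is worth -500 for the opponent), the discounting search returns the discounted score at depth 0 and finds the +500 conversion at depth 1, whereas the pruning search keeps returning 0 at every depth.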
Re: Endgame recognition and ELO
I ran 8500 games between Stockfish 6 and Stockfish 6 without the endgame functions.
The result is very strange, because the version without the endgame functions wins (see the results after the code).
In the file endgame.cpp I commented out the contents of the constructor:
Code: Select all
Endgames::Endgames() {
  // All registrations are commented out to disable the endgame functions:
  //add<KPK>("KPK");
  //add<KNNK>("KNNK");
  // etc.
}
Code: Select all
Rank Name                  Elo    +    -  games  score  oppo.  draws
   1 stockfish_NOendgame     5    4    5   8500    52%     -5    41%
   2 stockfish_original     -5    5    4   8500    48%      5    41%
-
- Posts: 190
- Joined: Sun Jul 14, 2013 10:00 am
- Real Name: H.G. Muller
Re: Endgame recognition and ELO
Interesting. Which fraction of the games was decided long before the end-game?
Re: Endgame recognition and ELO
H.G.Muller wrote: Interesting. Which fraction of the games was decided long before the end-game?
Sorry, I did not understand your question.
-
- Posts: 190
- Joined: Sun Jul 14, 2013 10:00 am
- Real Name: H.G. Muller
Re: Endgame recognition and ELO
Well, you played 8500 games. If 8400 of those games ended in a checkmate with 20 pieces on the board, you can be sure the end-game code did not do anything to affect the engine's move choice in those games. So if the results are not equal in those games, it must be because the end-game code you disabled causes a slowdown when it is enabled. If, however, all 8500 games ended with <= 5 men on the board, it is likely that the end-game code had a large role in deciding which 5 men those are, and consequently whether the result was checkmate or a draw.
Re: Endgame recognition and ELO
Yes, all 8500 games have <= 5 men; that is why the result seems strange to me.
Here is the link to the results PGN file:
https://drive.google.com/file/d/0ByhnaX ... sp=sharing
-
- Posts: 190
- Joined: Sun Jul 14, 2013 10:00 am
- Real Name: H.G. Muller
Re: Endgame recognition and ELO
Oh, I see. You already started from positions with <=5 men.
I am not sure what you would have expected the code that you disabled to do here. I can hardly imagine Stockfish has knowledge in it for how to play, say, KRKN. I had expected the routines you commented out to just affect the evaluation of the various material compositions, e.g. divide the evaluation of any KRKN position by 16, so it becomes something like +0.12 instead of +2.00. And not to recognize the cases where you could actually trap the Knight; you would expect the normal search to find that quickly enough, when it is possible.
Once you are in KRKN there is no benefit whatsoever in knowing it is a draw. Even if you think it is +2.00, you would still score only 1/2 point fifty moves later. The benefit of the knowledge that KRKN is in general a draw comes from avoiding the position when you are still in KRNPKRN, so that when you have the opportunity to convert to KRPKR with a +1.00 score you would prefer it over 'gaining' the exchange for a Pawn because you mistakenly think that is +2.00, and thus better. But you would never see that benefit if you already start in 4- or 5-men positions.
So I think your 52% result is just statistical noise. Another thing that is remarkable is that some games start with a Pawn on the first rank??? (e.g. 3928/3929)
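The KRNPKRN example above can be put into numbers with a small C++ sketch. The `Conversion` struct and the two helper functions are hypothetical names introduced for illustration; the scores (+1.00 for KRPKR, +2.00 for KRKN) and the divisor 16 are the ones used in the post, and integer division is assumed, which gives the +0.12 mentioned there.

```cpp
#include <string>

// Hypothetical description of an end-game the engine could convert into.
struct Conversion {
    std::string endgame;  // material signature, e.g. "KRKN"
    int naiveCp;          // score from naive material counting, in cP
    int divisor;          // discount for drawish material (1 = no discount)
};

// Score the engine would assign, with or without the end-game knowledge.
int score(const Conversion& c, bool useKnowledge) {
    return useKnowledge ? c.naiveCp / c.divisor : c.naiveCp;
}

// The conversion the engine would steer towards.
const Conversion& prefer(const Conversion& a, const Conversion& b, bool useKnowledge) {
    return score(a, useKnowledge) >= score(b, useKnowledge) ? a : b;
}
```

Without the knowledge the engine prefers KRKN (200 cP against 100 cP); with it, KRKN drops to 12 cP and KRPKR is chosen. That choice only arises while the game is still in the larger end-game, which is why starting the test games from positions with <= 5 men cannot show the effect.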