Re: New Rating List
Posted: Mon Aug 09, 2010 12:10 pm
by TPJR
Hi gaard,
can you run the same test with FireBird DD, but using PawnHash = 1/8 * Hash? (If you use Hash=512MB, you have already done the test.) In another thread it was explained that PawnHash should be set relative to Hash.
http://www.open-chess.org/viewtopic.php ... t=40#p5025
Thomas
Re: New Rating List
Posted: Mon Aug 09, 2010 12:35 pm
by gaard
Sure. I ran the last test with 1/4 PH, but I will go ahead and run another with 1/8. It is not obvious to me why this would make a difference, though I am always open to suggestions.

I'll report the first 120 games and then again once the 720-game gauntlet has concluded.
Re: New Rating List
Posted: Mon Aug 09, 2010 4:28 pm
by gaard
   4 FireBird 1.1 DD       64   120.0 ( 65.5 :  54.5)
                                 15.0 (  7.0 :   8.0)  Houdini 1.03         134
                                 19.0 (  6.0 :  13.0)  Rybka 4              130
                                 26.0 ( 14.0 :  12.0)  Stockfish 1.8         61
                                 14.0 (  5.5 :   8.5)  Critter 0.80           2
                                 27.0 ( 18.0 :   9.0)  Naum 4.2              -3
                                 19.0 ( 15.0 :   4.0)  Spark 0.4           -125

Rank Name                 Elo    +    -  games  score  oppo.  draws
   1 Houdini 1.03         134   19   19    984    72%    -27    35%
   2 Rybka 4              130   19   19    999    70%    -24    32%
   3 IvanHoe 9.52a        119   22   22    708    68%    -16    36%
   4 FireBird 1.1 DD       64   51   51    120    55%     31    39%
   5 Stockfish 1.8         61   19   19   1006    60%    -15    33%
   6 Critter 0.80           2   34   34    302    51%    -12    32%
   7 Naum 4.2              -3   18   18   1009    50%     -8    35%
   8 Shredder 12 32-bit   -48   20   20    876    45%    -15    32%
   9 Spark 0.4           -125   19   19    987    33%      5    30%
  10 Zappa Mexico II     -142   20   20    969    30%      6    31%
  11 Toga II 1.4beta5c   -193   21   21    968    24%     12    25%
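
With only 120 games the margin on FireBird 1.1 DD is still +/-51 Elo, so the interim figure says little. As a rough sketch of what the full gauntlet might buy, the snippet below assumes the margin shrinks in proportion to 1/sqrt(games); that scaling rule and the helper name `projected_margin` are illustrative assumptions, not something the rating tool reports.

```python
import math


def projected_margin(margin_now, games_now, games_target):
    """Scale an Elo error margin, assuming it shrinks as 1/sqrt(games)."""
    return margin_now * math.sqrt(games_now / games_target)


# FireBird 1.1 DD: +/-51 Elo after 120 games, projected to the 720-game gauntlet.
print(round(projected_margin(51, 120, 720)))  # prints 21
```

On that assumption the full 720 games should bring the margin down to roughly +/-21 Elo, in line with the engines that already have around 700 games in the list.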
Re: New Rating List
Posted: Tue Aug 10, 2010 6:31 pm
by gaard
Match concluded!
Ratings:
Rank Name                 Elo    +    -  games  score  oppo.  draws
   1 Houdini 1.03         134   18   18   1089    70%    -17    37%
   2 Rybka 4              129   18   18   1100    69%    -15    33%
   3 IvanHoe 9.52a        118   22   22    708    68%    -17    36%
   4 FireBird 1.1 DD       86   21   21    720    58%     31    42%
   5 Stockfish 1.8         60   18   18   1100    59%     -8    35%
   6 Naum 4.2              -2   17   17   1102    49%     -1    36%
   7 Critter 0.80          -9   29   29    408    46%     13    33%
   8 Shredder 12 32-bit   -49   20   20    876    45%    -17    32%
   9 Spark 0.4           -127   19   19   1088    31%     11    30%
  10 Zappa Mexico II     -144   20   20    969    30%      4    31%
  11 Toga II 1.4beta5c   -195   21   21    968    24%     11    25%

Details:
   4 FireBird 1.1 DD       86   720.0 (415.5 : 304.5)
                                120.0 ( 49.5 :  70.5)  Houdini 1.03         134
                                120.0 ( 50.5 :  69.5)  Rybka 4              129
                                120.0 ( 65.5 :  54.5)  Stockfish 1.8         60
                                120.0 ( 75.5 :  44.5)  Naum 4.2              -2
                                120.0 ( 78.5 :  41.5)  Critter 0.80          -9
                                120.0 ( 96.0 :  24.0)  Spark 0.4           -127

Likelihood of Superiority (per mille, row over column):
                     Hou  Ryb  Iva  Fir  Sto  Nau  Cri  Shr  Spa  Zap  Tog
Houdini 1.03           -  656  875  999  999 1000 1000 1000 1000 1000 1000
Rybka 4              343    -  788  999  999 1000  999 1000 1000 1000 1000
IvanHoe 9.52a        124  211    -  976  999 1000  999 1000 1000 1000 1000
FireBird 1.1 DD        0    0   23    -  975  999  999 1000 1000 1000 1000
Stockfish 1.8          0    0    0   24    -  999  999 1000 1000 1000 1000
Naum 4.2               0    0    0    0    0    -  655  999 1000 1000 1000
Critter 0.80           0    0    0    0    0  344    -  987  999  999 1000
Shredder 12 32-bit     0    0    0    0    0    0   12    -  999  999 1000
Spark 0.4              0    0    0    0    0    0    0    0    -  900  999
Zappa Mexico II        0    0    0    0    0    0    0    0   99    -  999
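
For readers who want a feel for where the likelihood-of-superiority figures come from, here is a rough approximation computed only from the Elo and +/- columns of the ratings list. It assumes the +/- columns are 95% bounds and that the two rating errors are independent and normally distributed; the rating tool works from its own model, so this is a sanity check and not its actual method. The function name `approx_los` is made up for this sketch.

```python
import math


def approx_los(elo_a, margin_a, elo_b, margin_b, z95=1.96):
    """Approximate P(A is stronger than B) from two ratings and their margins.

    Assumption: margins are 95% bounds on independent, normal rating errors.
    """
    # Standard deviation of the rating difference.
    sigma = math.hypot(margin_a, margin_b) / z95
    z = (elo_a - elo_b) / sigma
    # Standard normal CDF evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Per-mille values, to compare with the listing.
print(round(1000 * approx_los(134, 18, 129, 18)))  # Houdini over Rybka
print(round(1000 * approx_los(86, 21, 60, 18)))    # FireBird over Stockfish
```

Both results land within about ten points of the listed 656 and 975, which suggests the listing is behaving as expected even though the approximation ignores correlations between the ratings.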
Under these conditions, the chance of FireBird 1.1 DD being better than Rybka or Houdini is less than 0.001. It rates very close to the default version and almost identically to RobboLito. IMO this version of FireBird is so close to RobboLito, and scales just as poorly with more time, that I seriously doubt it would do better at a longer time control.
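
As a cross-check on the 86 Elo figure, the gauntlet score can be turned into a performance rating with the standard logistic Elo formula. This is only a sketch: it averages the six opponents' ratings and ignores draws and priors, which the rating tool handles with its own model, so a small discrepancy is expected. The names `elo_diff_from_score`, `opponents`, `avg_opp` and `perf` are illustrative.

```python
import math


def elo_diff_from_score(score, games):
    """Elo difference implied by a match score under the logistic Elo model."""
    p = score / games
    return -400.0 * math.log10(1.0 / p - 1.0)


# Opponent ratings from the Details listing (120 games against each).
opponents = [134, 129, 60, -2, -9, -127]
avg_opp = sum(opponents) / len(opponents)

# FireBird 1.1 DD scored 415.5 out of 720.
perf = avg_opp + elo_diff_from_score(415.5, 720)
print(round(avg_opp), round(perf))
```

The average opposition comes out at about 31, matching the oppo. column, and the performance rating at about 85, one point off the listed 86.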