Rebel wrote:
BB+ wrote: I'm not sure there is a "canonical" Fruit executable. In any event, the instruction should be fnstsw. For Rybka, this appears in Rick Fadden's disassembly at 0x4097F0. Running objdump on the Fruit 2.1 executable turned up about 5 places in parse_go with this instruction. The relevant source code is:
Code: Select all
if (movetime >= 0.0) { // fixed time
   SearchInput->time_is_limited = true;
   SearchInput->time_limit_1 = movetime * 5.0; // HACK to avoid early exit
   SearchInput->time_limit_2 = movetime;
I have never looked at the 32-bit dumps too closely, so Zach would know more.
The Rick Fadden post was helpful; it actually states that the block of assembly below should represent the notorious integer compare with 0.0:
Code: Select all
.text:004097E6 loc_4097E6: ; CODE XREF: Start_Go+2BCj
.text:004097E6 fild [esp+2Ch+movetime]
.text:004097EA fcomp ds:dbl_6623D0
.text:004097F0 fnstsw ax
.text:004097F2 test ah, 41h ; 41h = 65 Decimal
.text:004097F5 jnz short loc_40980E
.text:004097F7 lea ecx, [esi+esi*4]
.text:004097FA imul esi, 3E8h ; 3E8h = 1000 Decimal
.text:00409800 mov time_limit_1, ecx
.text:00409806 mov time_limit_2, esi
.text:0040980C jmp short loc_409846
With the best will in the world, it does not read as
if (movetime >= 0.0)
Perhaps Gerd can shed some light on this one.
I can take a stab at it.
Here are the comparison instructions, with a comment after each one describing what it does:
Code: Select all
.text:004097E6 fild [esp+2Ch+movetime] ; Load the movetime variable onto FP stack
.text:004097EA fcomp ds:dbl_6623D0 ; Compare ST(0) (the top of FP stack) with a constant from memory
; (at ds:0x6623D0 you'll presumably find the binary value for 0.0)
.text:004097F0 fnstsw ax ; Copy FPU status word into AX
.text:004097F2 test ah, 41h ; test the C0 and C3 bits of the FPU status word
.text:004097F5 jnz short loc_40980E
Note that FILD loads a 32-bit integer value and converts it into a float.
I found this table that shows what the bits of the FPU status word mean. The AH register being tested is the upper 8 bits of the 16-bit AX register.
The 41h constant tested is a combination of the "C0" flag (bit 8 of the 16-bit status word, which is bit 0 of AH, or 01h) and the "C3" flag (bit 14, which is bit 6 of AH, or 40h). So the JNZ branch is taken if either C0 or C3 was set by the FCOMP. C0 will be set if (ST(0) < source) or the result of the comparison was undefined (i.e. NaN). C3 will be set if (ST(0) == source) or the result was undefined.
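To double-check the bit arithmetic, here is a minimal C sketch of how status-word bits land in AH after FNSTSW. The names SW_C0, SW_C3 and ah_from_status_word are made up for illustration; only the bit positions come from the status word layout.

```c
#include <stdint.h>

/* Bit positions in the 16-bit x87 status word (names are illustrative). */
enum {
    SW_C0 = 1u << 8,   /* FCOMP sets this when ST(0) < source, or unordered  */
    SW_C3 = 1u << 14   /* FCOMP sets this when ST(0) == source, or unordered */
};

/* After "fnstsw ax", AH holds bits 8..15 of the status word. */
static uint8_t ah_from_status_word(uint16_t status_word)
{
    return (uint8_t)(status_word >> 8);
}
```

Passing SW_C0 | SW_C3 to ah_from_status_word gives 0x41, which is the constant in the TEST instruction.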
So basically, the FCOMP, FNSTSW, TEST and JNZ are equivalent to this code:
Code: Select all
if (movetime <= 0.0 || (movetime is NaN))
   goto loc_40980E;
It looks equivalent to:
Code: Select all
if (movetime > 0.0)
{
   // ... do stuff
}
loc_40980E:;
Notice that the Rybka code looks to me equivalent to "if (movetime > 0.0)", not "if (movetime >= 0.0)", unless I made a mistake when parsing it. I don't know why it would have been changed. I think if the compiler really wanted to test (movetime >= 0.0), it could have tested AH against 01h (C0 alone) instead of 41h to accomplish that.
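For anyone who wants to check this reading without an x87 manual at hand, here is a rough C model of the FCOMP / FNSTSW / TEST / JNZ sequence. It is only a sketch: fcomp_ah and body_runs are hypothetical helpers, and the model tracks just the C0 and C3 bits described above and ignores everything else in the status word.

```c
#include <math.h>
#include <stdint.h>

#define AH_C0 0x01u  /* C0 as seen in AH: ST(0) < source, or unordered  */
#define AH_C3 0x40u  /* C3 as seen in AH: ST(0) == source, or unordered */

/* Model of AH after "fcomp" + "fnstsw ax", keeping only C0 and C3. */
static uint8_t fcomp_ah(double st0, double src)
{
    if (isnan(st0) || isnan(src)) return AH_C0 | AH_C3;  /* unordered */
    if (st0 < src)                return AH_C0;
    if (st0 == src)               return AH_C3;
    return 0;                                            /* st0 > src */
}

/* Returns 1 when "test ah, mask / jnz skip" falls through into the if-body. */
static int body_runs(double st0, double src, uint8_t mask)
{
    return (fcomp_ah(st0, src) & mask) == 0;
}
```

In this model, body_runs(x, 0.0, 0x41) agrees with (x > 0.0) for negative, zero, positive and NaN inputs, while a mask that selects only C0 (0x01) agrees with (x >= 0.0).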
Rebel wrote:
Also I made a simple program:
main (argc,argv)
int argc; char *argv[];
{ int test=0;
if (test >= 0.0) { };
}
My compiler simply returned:
public _main
_TEXT segment
assume CS:_TEXT
_main:
xor EAX,EAX
ret
How nice, no complicated fnstsw stuff.
The explanation is simple!
Your compiler is smarter than the one that was used to compile Rybka 1.0 Beta. My guess is that it was compiled with either the ancient Microsoft VC 6 compiler (from last century), VC 7.0 (from 2002), or VC 7.1 (from 2003). [Edit: actually, even VC 6 might have optimized away your if-test. You didn't put any code inside the curly brackets, and test is not volatile, so the compiler knows that there are no side effects at all from the comparison or the if-statement body. It knows that it's dead code, and it optimizes it away. The assembly it output just does "return 0;".]
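If you want the comparison to survive into the assembly output, something like the following variant of the test should do it. This is only a sketch: test_value and is_non_negative are names I made up, and whether you actually get the fild / fcomp / fnstsw sequence still depends on the compiler and its options (a newer compiler may well turn it into a plain integer compare).

```c
/* volatile forces a real load of the variable on every call. */
static volatile int test_value = 0;

/* Returning the result gives the comparison an observable effect,
   so the compiler cannot discard it as dead code. */
static int is_non_negative(void)
{
    if (test_value >= 0.0) {  /* int is converted to double, then compared */
        return 1;
    }
    return 0;
}
```

Call is_non_negative() from main and look at the generated listing for that function rather than for an empty if-body.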
Compiler options can also have an effect on what gets generated. For example, there are other ways of testing the result of a floating-point comparison done in the x87 registers; but you would have to tell the compiler that your targeted CPU was at least a Pentium II or it would not be allowed to use them. I'm not sure whether it would even use them anyway; I don't have a copy of any of those three really old compilers anymore.
We did see signs that at least some of the initialization code in the Rybka 1.0 Beta appeared to have been compiled for Debug. Not just with a really stupid compiler, but with optimizations turned off. I can't imagine why though.
If you look at these disassembly listings on the ICGA wiki, you'll see a bunch of redundant instructions, and stuff like that. Here is a comment I posted when we were discussing it:
Wylie wrote: I just noticed the Rybka-Crafty Evidence III page that has been put up, and looked at the whole snippet. That has to have been compiled for debug. There's no loop induction var for (pawn_hash_table+i), there's a totally useless lea at 0x4520ea, and of course that needlessly redundant loading of constants. It reloads the count to stop at from 0x6b8990 on every iteration.
I'm surprised that a chess engine would have debug-compiled code in it! Weird.
But Mark Watkins pointed out the purpose of the "totally useless lea at 0x4520ea" -- it is a multi-byte nop to align the first instruction of the loop.