Trotsky wrote:
> But, for now, the BB "results" of
> 74% Rybka-Fruit "overlap"
> 54% Crafty-Fruit "overlap"
> and around 30% overlap elsewhere

http://rybkaforum.net/cgi-bin/rybkaforu ... #pid354189
As noted by Adam Hair (in one of the few readable posts in that thread, it seems), this applies a mix-and-match for the 54%, though I admit it would be easier to determine this if the ICGA material were better organised. I would also say that the quoted "around" understates the spread of the elsewhere-results. In all events, the "1 in X" numbers should be of more interest than any raw percentages.
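As a toy illustration of why a "1 in X" figure is more informative than a raw percentage (this is only a sketch under an independence assumption, and not the actual EVAL_COMP methodology): the raw percentage by itself says nothing about how likely that much agreement is by coincidence, whereas a binomial tail does. The feature count and the 30% baseline below are hypothetical numbers chosen purely for illustration.

```python
from math import comb

def one_in_x(n_features: int, n_matching: int, p_chance: float) -> float:
    """Chance of at least n_matching agreements out of n_features independent
    yes/no feature decisions, if each agrees by coincidence with probability
    p_chance (a binomial tail)."""
    return sum(comb(n_features, k) * p_chance ** k * (1 - p_chance) ** (n_features - k)
               for k in range(n_matching, n_features + 1))

# Hypothetical numbers for illustration only: 100 feature decisions and a
# 30% baseline chance of coincidental agreement on each one.
for overlap in (30, 54, 74):
    p = one_in_x(100, overlap, 0.30)
    print(f"{overlap} matches of 100 -> p = {p:.3g}, i.e. about 1 in {1 / p:,.0f}")
```

The same raw gap between 54% and 74% corresponds to many orders of magnitude in the "1 in X" figure, which is the point of preferring the latter.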
Trotsky wrote:
> When BB presented that paper, somebody should have said "er, but your results map to ELO, do they not? go away, do your homework again and come back with something better that isn't going to make us look stupid".
> [...]
> If anybody cares to put the program ELO's against BB's "scores" he will find high correlation.

I think this criticism underestimates the process of the ICGA Panel. In any event, Adam Hair interpreted the above criticism to mean not that "results map to Elo", but to Elo difference; that is, that there should be a correlation between X% overlap and a margin of Y in rating. (I'm not sure why AH didn't renormalise/rescale one axis or the other as part of a linear-regression data analysis, but I agree that any such correlation is rather unremarkable in the first place, particularly in comparison to the Rybka/Fruit outlier.) However, my interpretation (also noted in passing by AH) is that the point here is that Rybka and Fruit are the strongest engines in the set, with everything else much lower-rated -- the idea being that engines should become more similar as they become stronger and/or as their authors gain more knowledge. A counterpoint to this latter claim [at least in its current state] would be R3, which is even stronger, yet would not have too great an "evaluation feature" overlap with the engines of interest. Similarly with Stockfish, if you wish to exit the Rybka family for comparison.
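For what it is worth, the sort of check being discussed takes only a few lines. The numbers below are placeholders (not AH's data, nor mine); the sketch merely shows that the correlation coefficient -- which is unaffected by rescaling either axis -- and the effect of removing the single Rybka/Fruit point are the quantities one would actually look at.

```python
import numpy as np

# Placeholder data (NOT Adam Hair's actual numbers): each engine is a pair of
# (evaluation-feature overlap with Fruit in %, Elo difference from Fruit).
overlap  = np.array([74.0, 54.0, 33.0, 28.0, 31.0, 27.0])
elo_diff = np.array([150.0, -50.0, -250.0, -400.0, -300.0, -350.0])

# Pearson correlation is unchanged by rescaling either axis, so the real
# questions are how strong the fit is and how much a single outlier drives it.
r = np.corrcoef(overlap, elo_diff)[0, 1]
slope, intercept = np.polyfit(overlap, elo_diff, 1)
print(f"Pearson r = {r:.2f}; fit: elo_diff ~ {slope:.1f} * overlap + {intercept:.1f}")

# Drop the strongest engine (the Rybka/Fruit point, index 0) and re-check.
r_wo = np.corrcoef(overlap[1:], elo_diff[1:])[0, 1]
print(f"Without that outlier: r = {r_wo:.2f}")
```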
Going back to the Panel, the Elo argument (as to rebutting EVAL_COMP) was partially/tangentially broached in Panel discussions, and essentially rejected. This was both because any correlation seemed to be weak at best, and also because Elo strength is not directly relevant for "originality" purposes. If Rajlich had made this argument, and (say) requested that a similarly-rated engine from 2005-6 submit its source code to be inspected, etc., I fully expect this would have been done -- but he chose not to dispute the issue. [Indeed, usually the accused has to make a specific defense, and can't rely on the sum-total of all possible defenses to be raised on his behalf].
I can also note that EVAL_COMP was only produced due to a desire to ensure that this "evaluation feature" evidence could be sufficiently quantified. Although it formed a large portion of the Fruit/Rybka evidence, the "evaluation feature overlap" was already accepted by many Panel members in its qualitative state, and "evaluation features" themselves were only one part of the total evidence presented. For instance, for "probative similarity" one can note the seemingly literally copied code in the search control and iterative deepening routines, mentioned respectively at the end of Section 6.3.2 and in Appendix A of the RYBKA_FRUIT document.
Trotsky wrote:
> It is quite impossible to say what the developmental process was, there is no proof that A=Fruit, any better than there is proof that A=developmental Rybka, and Vas kept a copy of Fruit and/or Crafty and/or anything else open on other screens at the final development stage.

Copyright law is based upon "substantial similarity" and has a rather low threshold of proof (particularly in Poland). It is largely agnostic as to what the "developmental process" was, though I agree that this could be raised as an affirmative defense [which Rajlich chose not to do in the ICGA process].
I can't find the post anymore, but I also saw one claiming that the ICGA process only looked at the evaluation features, and not their relative weightings [and perhaps mocking the Panel for this]. This aspect was discussed, and the general opinion was that the evaluation features alone were already sufficient to establish non-originality. For the forthcoming legal process by Fabien/FSF, the question of the relative importance of evaluation features versus their weightings will be discussed more, if nothing else as a "percentage" basis for copyright infringement. Here it can be noted that relative weightings, at least for purposes of optimising strength, can often be tuned via an automated (and thus less creative) process -- see the sketch below. [Alan Sassler noted that one could try a similar methodology to construct the evaluation features themselves, though describing the tics of the various implementations is nontrivial, and I don't think anyone has made this workable in general -- and even if so, it seems quite unlikely that one would reproduce the Fruit evaluation features from any reasonable-sized set.]
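To make the "automated (thus less creative)" point concrete, here is a minimal sketch of what such a tuning loop looks like. Everything in it is hypothetical -- the tiny data set, the linear evaluation, the local-search step -- and it merely stands in for real methods such as CLOP, SPSA, or Texel-style tuning; the point is only that the weights fall out of an optimisation procedure once the features themselves are fixed.

```python
import random

# Hypothetical training data: (feature_vector, game_result) pairs, where the
# result is 1.0 / 0.5 / 0.0 from White's point of view. A real tuner would
# use millions of positions extracted from engine games.
data = [([1, 0, 2, -1], 1.0), ([0, 1, -1, 0], 0.0), ([2, 1, 0, 1], 0.5)]

def predict(weights, features):
    """Map a linear evaluation (in centipawn-like units) to an expected score."""
    score = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + 10.0 ** (-score / 400.0))

def error(weights):
    """Mean squared difference between predicted and actual game results."""
    return sum((predict(weights, f) - r) ** 2 for f, r in data) / len(data)

# Simple local search: nudge one weight at a time and keep any improvement.
# Real tuners are fancier, but the weights are still machine-chosen.
weights = [10.0, 10.0, 10.0, 10.0]
for _ in range(2000):
    i = random.randrange(len(weights))
    trial = weights[:]
    trial[i] += random.choice((-1.0, 1.0))
    if error(trial) < error(weights):
        weights = trial

print("tuned weights:", weights, "error:", round(error(weights), 4))
```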