On Dalke
Posted: Sun Feb 19, 2012 11:09 pm
Since Dalke seemed a bit unaware of many important facets of the Rybka case (and Schröder seems to have done nothing to rectify this), I felt it wise not to comment previously. However, Riis/ChessBase have now promoted his writings, and thus I shall rebut them.
Rebel wrote:
[Dalke] also is recognized by Rybka investigator Watkins as an expert

Schröder is putting words in my mouth. I said he is "indeed quite adept at disassembly". I see no reason to think, for instance, that Dalke knows much about computer chess (which becomes relevant when discussing non-literal elements). He also seems to be unaware of EVAL_COMP, and appears to take RYBKA_FRUIT (which was a preliminary listing of all possible evidence) as normative (whereas I would prefer the RECAP). I'm also not sure that he's read (say) the ICGA Secretariat Report.
Onto his writing:
Andrew Dalke wrote:
Indeed,

I feel that Rebel has cut something that preceded this?
Andrew Dalke wrote:
They argue that those changes are mechanical transformations of the Fruit implementation, and therefore not a new implementation of the uncopywriteable algorithms expressed in Fruit but a derivative work in the copyright sense.

I don't know where this argument was made. The argument was that Rybka's use of bitboards (pace Fruit) was irrelevant to the discussion, as a higher level of abstraction was used in the comparison.
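To make this concrete, here is a minimal sketch (hypothetical code of my own, taken from neither engine) of why a representational difference such as bitboards washes out once one compares at a higher level of abstraction: the same abstract evaluation feature is computed identically from two quite different data structures.

```python
# Hypothetical illustration: counting files with doubled pawns, once
# from a bitboard (as in Rybka) and once from an array-based board
# (Fruit being a non-bitboard engine). The data structures differ
# entirely; the abstract feature extracted from them is the same.

FILE_A = 0x0101010101010101  # bitboard mask for the a-file (a1 = bit 0)

def doubled_files_bitboard(pawns):
    """pawns: 64-bit integer, one bit set per pawn."""
    count = 0
    for f in range(8):
        if bin(pawns & (FILE_A << f)).count("1") >= 2:
            count += 1
    return count

def doubled_files_mailbox(board):
    """board: 8x8 list of rank lists, with 'P' marking a pawn."""
    count = 0
    for f in range(8):
        if sum(1 for r in range(8) if board[r][f] == "P") >= 2:
            count += 1
    return count

# Pawns on a2 and a4 in both representations:
bb = (1 << 8) | (1 << 24)
mb = [["." for _ in range(8)] for _ in range(8)]
mb[1][0] = mb[3][0] = "P"
assert doubled_files_bitboard(bb) == doubled_files_mailbox(mb) == 1
```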
Andrew Dalke wrote:
They have instead gone up to a higher level of abstraction [and] shown that the code in Fruit, with different input parameters than the Fruit defaults, can generate numbers which after post-processing match numbers used in Rybka. They have stated that the order of certain actions, where the order should be arbitrary, is consistent between the two programs.

This is again a distortion and/or minimisation of the evidence. The first sentence appears to describe the PST evidence. The second can apply to various parts, but most notably to root search. Dalke continues that "While this was enough to convince the judges..." -- but he has omitted the bulk of evidence in his short dismissal.
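For readers unfamiliar with the PST evidence, the following toy sketch (entirely made-up numbers and formulas of my own, not Fruit's actual table generator) shows the kind of match being described: a parameterised generator whose output, under non-default inputs, reproduces a target table exactly.

```python
# Toy model of the PST evidence: a table generator driven by a few
# input parameters. If some non-default parameter choice reproduces,
# after simple post-processing, the exact numbers found in another
# program's binary, that says something about origin, not about ideas.

FILE_SHAPE = [-3, -1, 0, 1, 1, 0, -1, -3]  # invented centre-weighting

def generate_file_bonus(weight, offset):
    """Hypothetical generator: per-file bonus from two tunable inputs."""
    return [weight * f + offset for f in FILE_SHAPE]

# Pretend these values were read out of another engine's data tables:
target = [-13, -3, 2, 7, 7, 2, -3, -13]

# Scan a small parameter space for an exact reproduction:
for w in range(1, 10):
    for c in range(-10, 11):
        if generate_file_bonus(w, c) == target:
            print(f"target table reproduced with weight={w}, offset={c}")
```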
Andrew Dalke wrote:
[...] the comparison method should be validated by applying it to other programs which use the same algorithmic approach as Fruit and which are known to not have a shared copyright history. The Rybka investigators have failed to do this.

It seems to me that the "Rybka investigators" did do this. For instance, EVAL_COMP contained a variety of open-source engines that were available in 2005. Additionally, as another example, the similarities in root search were contrasted to Phalanx and others (this is already in RYBKA_FRUIT).
Dalke then goes on about clean rooms for about 5 paragraphs, but I don't think anything he says is relevant. He seems to think there is only some "algorithmic influence" from Fruit in Rybka, whereas the investigation concluded there was specific creative content derived from Fruit in Rybka. This is (in particular) where his analogy to operating systems (where non-literal elements are mostly nonexistent) fails.
Dalke then suggests (sotto voce) that someone who does a meta-analysis frequently has "an economic, social, or political agenda such as the passage or defeat of legislation." He then continues his quotation from Wikipedia with "the favored authors may themselves be biased or paid to produce results that support their overall political, social, or economic goals in ways such as selecting small favorable data sets and not incorporating larger unfavorable data sets." However, he does little to show any actual "bias" in the case at hand, other than to voice this Wikiquote. It is also not clear to me whether he considers EVAL_COMP to itself be a meta-analysis (if he even knows it exists), or what.
Andrew Dalke wrote:
There are ways to help offset these problems [with meta-analysis]. For example, all comparison methods should be reported before doing the analysis, along with the definition of what "infringing" means for that case. Methods which fail to report similarity must be recorded. All participants must state possible sources of bias, and the method for selecting the participants must also be published. This was not done. [...]

Again Dalke is misinformed. For instance, the EVAL_COMP procedure was discussed before it went into operation. He is correct that there was no consultation as to how extreme the Rybka/Fruit overlap needed to be to be "infringing", but the clarity of the end result proved this superfluous. Also, the Panel did discuss a number of other items, and for some of these the Fruit/Rybka similarity was found to be unclear and/or of minor relevance. As for the EVAL_COMP analysis itself, it is open for anyone to review or critique, so I see no reason to jockey the "bias" angle.
Andrew Dalke wrote:
Of course, if there was strong evidence for copyright infringement then a careful synthesis of the evidence would not be needed, but that was not the case here. Instead, the results seem very much cherry-picked.

If the results are "very much cherry-picked", it should be easy for someone (say) to re-do the EVAL_COMP construction (with other engines, if desired), and show that the Fruit/Rybka datapoint is not extraordinary (a sketch of such a re-check follows below). Instead, all we get is continual chatter along the lines that there might be some bias somewhere, or that if the analysis were re-done in such-and-such a manner, maybe the result might be different; then again, Internet discussions are not exactly known for a proper transfer of evidentiary burden.
At the least, Dalke could explicitly state that he has no actual evidence of bias in the given case, but is merely talking in general. As it stands, his innuendo verges on the libelous, particularly with the word-association from the previous Wikiquote.
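Returning to the re-check suggested above, here is a rough sketch (with invented overlap scores; a real run would derive them from an EVAL_COMP-style feature comparison) of what testing whether the Fruit/Rybka datapoint is extraordinary would involve.

```python
from statistics import mean, stdev

# Invented pairwise overlap scores (fraction of shared evaluation
# features) for a small pool of engines; only the Fruit/Rybka entry
# is singled out for testing.
overlap = {
    frozenset(("Fruit", "Rybka")):    0.74,
    frozenset(("Fruit", "Crafty")):   0.21,
    frozenset(("Fruit", "Phalanx")):  0.25,
    frozenset(("Rybka", "Crafty")):   0.19,
    frozenset(("Rybka", "Phalanx")):  0.23,
    frozenset(("Crafty", "Phalanx")): 0.27,
}

pair = frozenset(("Fruit", "Rybka"))
rest = [s for p, s in overlap.items() if p != pair]
z = (overlap[pair] - mean(rest)) / stdev(rest)
print(f"Fruit/Rybka sits {z:.1f} standard deviations above the other pairs")
```

Anyone alleging cherry-picking is welcome to substitute their own engine pool and scores, and see whether the outlier survives.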
Andrew Dalke wrote:
They claim to use the abstraction-filtration-comparison test to determine substantial similarity, but without the appropriate filtration. At each of the structural levels they fail to show that the discovery methods are not producing false positives, and they fail to demonstrate that the similarity level is greater than would be expected from a non-infringing chess program implementing the idea at the same structural level.

Again Dalke's comment seems almost a non sequitur to me, unless one assumes that he is ignorant of EVAL_COMP. As far as I can tell, EVAL_COMP showed exactly that non-infringing chess programs have much less similarity in evaluation features. He claims that there was "no appropriate filtration", but does not really say what this means. The EVAL_COMP methodology of comparing among a pool of engines formed a natural way to determine whether a given element was so common that it should be ignored (that is, filtered); EVAL_COMP found (in general) that few things should be filtered, for most engines differed quite notably in their various aspects.
I reiterate my comment concerning isolated pawns (see 4.2.3 of the RECAP) --- what exactly should be filtered from this that was not [the definition of "isolated" was filtered]?
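As a sketch of what such prevalence-based filtration looks like in practice (with hypothetical feature labels of my own; EVAL_COMP's actual comparisons were at a finer grain than bare feature names): an element is dropped from the comparison only when essentially every engine in the pool has it.

```python
from collections import Counter

# Hypothetical evaluation-feature sets for a pool of engines:
engines = {
    "EngineA": {"isolated_pawn", "doubled_pawn", "rook_open_file", "king_shelter"},
    "EngineB": {"isolated_pawn", "passed_pawn", "rook_open_file"},
    "EngineC": {"isolated_pawn", "mobility", "bishop_pair"},
    "EngineD": {"isolated_pawn", "passed_pawn", "mobility", "king_shelter"},
}

# In how many engines does each feature appear?
prevalence = Counter(f for feats in engines.values() for f in feats)

# Filter (ignore) anything present in the whole pool: here the bare
# concept "isolated_pawn" carries no signal, just as the definition of
# "isolated" was filtered in the RECAP discussion. What survives is the
# distinctive material on which similarity is then measured.
pool = len(engines)
filtered = {name: {f for f in feats if prevalence[f] < pool}
            for name, feats in engines.items()}
for name, feats in sorted(filtered.items()):
    print(name, sorted(feats))
```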
Again I will state that it is not clear to me that Dalke (as he is not a computer chess expert) realises that the evaluation function of a computer chess program is not purely "algorithmic", but also contains a significant quantity of creative content in its design.

EVAL_COMP wrote:
Fruit 2.1, Rybka 1.0 Beta, and Rybka 2.3.2a all give a penalty for an isolated pawn that depends on whether the file is half-open or closed [and make no other consideration].
Crafty 19.0 counts the number of isolated pawns, and the subcount of those on open files, and then applies array-based scores to these.
Phalanx XXII gives a file-based penalty, and then adjusts the score based upon the number of knights the opponent has, the number of bishops we have, and whether an opposing rook attacks it. There is then a correction if an isolated pawn is a "ram", that is, blocked by an enemy pawn face-to-face, and also a doubled-and-isolated penalty.
Pepito 1.59 has a file-based array for penalties, though the contents are constant except for the rook files. There is also a further penalty for multiple isolani.
Faile 1.4 penalises isolated pawns by a constant amount, with half-open files penalised more (same as Fruit/Rybka).
RESP 0.19 also penalises isolated pawns by a constant amount, and gives an additional penalty to isolated pawns that are doubled.
EXchess also gives a constant penalty for isolated pawns, and further stores it in a king/queenside defect count.
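To underline how much creative latitude sits inside even this one term, here are simplified, hypothetical renderings (all constants invented by me) of two of the designs quoted above: the Fruit/Rybka formulation against the far more elaborate Phalanx one.

```python
# Two invented-constant sketches of an "isolated pawn" term, following
# the shapes described in the EVAL_COMP quote above.

def isolated_penalty_fruit_style(half_open):
    # Fruit/Rybka per the quote: a single penalty, larger on a
    # half-open file, with no other consideration.
    return -20 if half_open else -10

def isolated_penalty_phalanx_style(file_, opp_knights, own_bishops,
                                   attacked_by_rook, is_ram, is_doubled):
    # Phalanx per the quote: a file-based base penalty, adjusted for
    # the opponent's knights, our bishops, an attacking enemy rook,
    # a "ram" (blocked face-to-face by an enemy pawn), and the
    # doubled-and-isolated case.
    base = [-8, -10, -12, -14, -14, -12, -10, -8][file_]
    score = base - 2 * opp_knights + 2 * own_bishops
    if attacked_by_rook:
        score -= 4
    if is_ram:
        score -= 3
    if is_doubled:
        score -= 6
    return score

print(isolated_penalty_fruit_style(half_open=True))                # -20
print(isolated_penalty_phalanx_style(3, 2, 1, True, False, True))  # -26
```

The point is not the particular numbers, but the shape of the design space: six engines, six different formulations.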
Andrew Dalke wrote:
The similarity between Fruit and Rybka is strongest at the highest level of the analysis, but the abstraction-filtration-comparison test acknowledges that at a high enough level there's no copyright protection. This is due to the merger doctrine.

Change Fruit to X and Rybka to Y, and this could be a template statement... However, I would argue that it is still not quite correctly used here, as "evaluation features" (which is where AFC was used) are not that near the "highest level" when comparing computer chess programs.
Andrew Dalke wrote:
Copyright law already acknowledges that at higher levels there's no copyright infringement because it's different expressions of a common idea. Hence the statement "high-level functionality is always equivalent in these cases" is meaningless unless it's established that this level is not high enough.

Again I find this to be generic mumbo-jumbo, not specifically Fruit/Rybka related.
The ICGA investigation found that (among other things, I might stress) various Rybka versions used a collection of evaluation features that was substantially similar to the collection used by Fruit (particularly in the collative aspects). This was determined by comparing a number of engines in a way that can be replicated, rebutted, or extended (if desired), with the result that the Rybka/Fruit overlap was an outlier of more than 5 standard deviations (in a group of 30 comparisons).
Furthermore, the Panel concluded that said choice of evaluation features was something that involved a notable amount of creativity, and thus Rybka's "originality" as per Rule #2 was found to be lacking. Rajlich chose not to defend himself against this (or any) charge, and the Board accepted it as valid. Some other pieces of evidence are enumerated in the RECAP PDF (Section 4) and in my Riis rebuttal (Section 3.4).
Though the Panel did not adopt it themselves, I had argued that evaluation features could be considered as analogous to the plot of a book, and subject to "protection" in a similar manner. In this regard, one can note that computer programs are considered (in all relevant jurisdictions, it seems) to be literary works.
Finally, I find it odd (to say the least) that Dalke contacted Schröder, but neither me, nor Zach, nor the ICGA, before delivering his summary conclusion.