On Dalke

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

On Dalke

Post by BB+ » Sun Feb 19, 2012 11:09 pm

Since Dalke seemed a bit unaware of many important facets of the Rybka case (and Schröder seems to have done nothing to rectify this), I felt it wise not to comment previously. However, Riis/ChessBase have now promoted his writings, and thus I shall rebut them.
Rebel wrote: [Dalke] also is recognized by Rybka investigator Watkins as an expert
Schröder is putting words in my mouth. I said he is "indeed quite adept at disassembly". I see no reason to think, for instance, that Dalke knows much about computer chess (which becomes relevant when discussing non-literal elements). He also seems to be unaware of EVAL_COMP, and appears to take RYBKA_FRUIT (which was a preliminary listing of all possible evidence) as normative (whereas I would prefer the RECAP). I'm also not sure that he's read (say) the ICGA Secretariat Report.

Onto his writing:
Andrew Dalke wrote: Indeed,
I feel that Rebel has cut something that preceded this?
Andrew Dalke wrote: They argue that those changes are mechanical transformations of the Fruit implementation, and therefore not a new implementation of the uncopyrightable algorithms expressed in Fruit but a derivative work in the copyright sense.
I don't know where this argument was made. The argument was that Rybka's use of bitboards (pace Fruit) was irrelevant to the discussion, as a higher level of abstraction was used in the comparison.
Andrew Dalke wrote: They have instead gone up to a higher level of abstraction [and] shown that the code in Fruit, with different input parameters than the Fruit defaults, can generate numbers which after post-processing match numbers used in Rybka. They have stated that the order of certain actions, where the order should be arbitrary, is consistent between the two programs.
This is again a distortion and/or minimisation of the evidence. The first sentence appears to describe the PST evidence. The second can apply to various parts, but most notably to root search. Dalke continues that "While this was enough to convince the judges..." -- but he has omitted the bulk of evidence in his short dismissal.
Andrew Dalke wrote: [...] the comparison method should be validated by applying it to other programs which use the same algorithmic approach as Fruit and which are known to not have a shared copyright history. The Rybka investigators have failed to do this.
It seems to me that the "Rybka investigators" did do this. For instance, EVAL_COMP contained a variety of open-source engines that were available in 2005. Additionally, as another example, the similarities in root search were contrasted to Phalanx and others (this is already in RYBKA_FRUIT).

Dalke then goes on about clean rooms for about 5 paragraphs, but I don't think anything he says is relevant. He seems to only think there is some "algorithmic influence" from Fruit in Rybka, whereas the investigation concluded there was specific creative content derived from Fruit in Rybka. This is (in particular) where his analogy to operating systems fails, where non-literal elements are mostly nonexistent.

Dalke then suggests (sotto voce) that someone who does a meta-analysis frequently has "an economic, social, or political agenda such as the passage or defeat of legislation." He then continues his quotation from Wikipedia with "the favored authors may themselves be biased or paid to produce results that support their overall political, social, or economic goals in ways such as selecting small favorable data sets and not incorporating larger unfavorable data sets." However, he does little to show any actual "bias" in the case at hand, other than to voice this Wikiquote. It is also not clear to me whether he considers EVAL_COMP to itself be a meta-analysis (if he even knows it exists), or what.
Andrew Dalke wrote: There are ways to help offset these problems [with meta-analysis]. For example, all comparison methods should be reported before doing the analysis, along with the definition of what "infringing" means for that case. Methods which fail to report similarity must be recorded. All participants must state possible sources of bias, and the method for selecting the participants must also be published. This was not done. [...]
Again Dalke is misinformed. For instance, the EVAL_COMP procedure was discussed before it went into operation. He is correct that there was no consultation as to how extreme the Rybka/Fruit overlap needed to be to be "infringing", but the clarity of the end result proved this superfluous. Also, the Panel did discuss a number of other items, and for some of these the Fruit/Rybka similarity was found to be unclear and/or of minor relevance. As for the EVAL_COMP analysis itself, it is open for anyone to review or critique, so I see no reason to jockey the "bias" angle.
Andrew Dalke wrote: Of course, if there was strong evidence for copyright infringement then a careful synthesis of the evidence would not be needed, but that was not the case here. Instead, the results seem very much cherry-picked.
If the results are "very much cherry-picked", it should be easy for someone (say) to re-do the EVAL_COMP construction (with other engines, if desired), and show that the Fruit/Rybka datapoint is not extraordinary. Instead, all we get is a continual chattering along the lines that there might be some bias somewhere, or if the analysis were re-done in such-and-such manner, maybe the result might be different; then again, Internet discussions are not exactly known for a proper transfer of evidentiary burden.

At the least, Dalke could explicitly state that he has no actual evidence of bias in the given case, but is merely talking in general. As it stands, his innuendo verges on the libelous, particularly with the word-association from the previous Wikiquote.
Andrew Dalke wrote: They claim to use the abstraction-filtration-comparison test to determine substantial similarity, but without the appropriate filtration. At each of the structural levels they fail to show that the discovery methods are not producing false positives, and they fail to demonstrate that the similarity level is greater than would be expected from a non-infringing chess program implementing the idea at the same structural level.
Again Dalke's comment seems almost a non sequitur to me, unless one assumes that he is ignorant of EVAL_COMP. As far as I can tell, EVAL_COMP exactly showed that non-infringing chess programs have much less similarity in evaluation features. He claims that there was "no appropriate filtration", but really doesn't say what this means. The EVAL_COMP methodology of comparing among a pool of engines formed a natural way to determine whether a given element was so common that it should be ignored (that is, filtered); EVAL_COMP found (in general) that few things should be filtered, for most engines differed quite notably in their various aspects.

I reiterate my comment concerning isolated pawns (see 4.2.3 of the RECAP) --- what exactly should be filtered from this that was not [the definition of "isolated" was filtered]?
EVAL_COMP wrote: Fruit 2.1, Rybka 1.0 Beta, and Rybka 2.3.2a all give a penalty for an isolated pawn that depends on whether the file is half-open or closed [and make no other consideration].

Crafty 19.0 counts the number of isolated pawns, and the subcount of those on open files, and then applies array-based scores to these.

Phalanx XXII gives a file-based penalty, and then adjusts the score based upon the number of knights the opponent has, the number of bishops we have, and whether an opposing rook attacks it. There is then a correction if an isolated pawn is a "ram", that is, blocked by an enemy pawn face-to-face, and also a doubled-and-isolated penalty.

Pepito 1.59 has a file-based array for penalties, though the contents are constant except for the rook files. There is also a further penalty for multiple isolani.

Faile 1.4 penalises isolated pawns by a constant amount, with half-open files penalised more (same as Fruit/Rybka).

RESP 0.19 also penalises isolated pawns by a constant amount, and gives an additional penalty to isolated pawns that are doubled.

EXchess also gives a constant penalty for isolated pawns, and further stores it in a king/queenside defect count.
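
To make the contrast in the list above concrete, here is a minimal Python sketch of two of the described styles: a per-pawn penalty that depends only on whether the file is half-open or closed (as described for Fruit/Rybka), and a count-based array lookup (as described for Crafty 19.0). The board representation, the function names, and every numeric value are assumptions made for illustration; none are taken from any engine's source, and "half-open" is simplified here to "no enemy pawn anywhere on the file".

```python
# Hypothetical values for illustration only; not taken from any engine.
PENALTY_CLOSED = 10
PENALTY_HALF_OPEN = 20
COUNT_TABLE = [0, 8, 20, 40, 60, 70, 80, 80, 80]       # indexed by number of isolani
OPEN_COUNT_TABLE = [0, 4, 10, 16, 24, 24, 24, 24, 24]  # indexed by isolani on open files


def isolated_pawns(own_pawns):
    """Return the (file, rank) pawns with no friendly pawn on an adjacent file."""
    files = {f for f, _ in own_pawns}
    return [(f, r) for f, r in own_pawns
            if (f - 1) not in files and (f + 1) not in files]


def per_pawn_penalty(own_pawns, enemy_pawns):
    """Style described for Fruit/Rybka: each isolated pawn is penalised,
    with the amount depending only on whether its file is half-open."""
    enemy_files = {f for f, _ in enemy_pawns}
    total = 0
    for f, _ in isolated_pawns(own_pawns):
        total += PENALTY_CLOSED if f in enemy_files else PENALTY_HALF_OPEN
    return total


def count_based_penalty(own_pawns, enemy_pawns):
    """Style described for Crafty 19.0: count the isolated pawns and the
    subcount on open files, then look both counts up in score arrays."""
    enemy_files = {f for f, _ in enemy_pawns}
    isolani = isolated_pawns(own_pawns)
    on_open = sum(1 for f, _ in isolani if f not in enemy_files)
    return COUNT_TABLE[len(isolani)] + OPEN_COUNT_TABLE[on_open]
```

Both functions start from the same chess concept (an isolated pawn), yet they decompose the scoring differently; that kind of difference is what a feature-by-feature comparison across engines is meant to pick up.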
Again I will state that it is not clear to me that Dalke (as he is not a computer chess expert) realises that the evaluation function of a computer chess program is not purely "algorithmic", but also contains a significant quantity of creative content in its design.
Andrew Dalke wrote: The similarity between Fruit and Rybka is strongest at the highest level of the analysis, but the abstraction-filtration-comparison test acknowledges that at a high enough level there's no copyright protection. This is due to the merger doctrine.
Change Fruit to X and Rybka to Y, and this could be a template statement... However, I would argue that it is still not quite correctly used here, as "evaluation features" (which is where AFC was used) is not that near the "highest level" when comparing computer chess programs.
Andrew Dalke wrote: Copyright law already acknowledges that at higher levels there's no copyright infringement because it's different expressions of a common idea. Hence the statement "high-level functionality is always equivalent in these cases" is meaningless unless it's established that this level is not high enough.
Again I find this to be generic mumbo-jumbo, not specifically Fruit/Rybka related.

The ICGA investigation found that (among other things, I might stress) various Rybka versions used a collection of evaluation features that was substantially similar to the collection used by Fruit (particularly in the collative aspects). This was determined via comparing a number of engines in a way that can be replicated, rebutted, or extended (if desired), with the result being that the Rybka/Fruit overlap was an outlier of more than 5 standard deviations (in a group of 30 comparisons).
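
As a rough illustration of what "an outlier of more than 5 standard deviations" means, the following Python sketch measures how far one pairwise overlap score lies from the rest of a group. The scores are synthetic placeholders, and computing the baseline from the other comparisons only is an assumption; this is neither the EVAL_COMP data nor necessarily its exact procedure.

```python
import statistics


def z_score(value, baseline):
    """Distance of `value` from the mean of `baseline`, in sample standard deviations."""
    mean = statistics.mean(baseline)
    spread = statistics.stdev(baseline)
    return (value - mean) / spread


# Synthetic overlap scores for 29 hypothetical engine pairs (NOT real data).
other_pairs = [0.25 + 0.01 * (i % 10) for i in range(29)]
# Synthetic score for the one pair under scrutiny (NOT real data).
suspect_pair = 0.74

z = z_score(suspect_pair, other_pairs)
is_outlier = z > 5  # the threshold mentioned in the post
```

Anyone wishing to contest the result could re-run such a calculation with a different pool of engines or a different scoring of features and check whether the datapoint stays extraordinary.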

Furthermore, the Panel concluded that said choice of evaluation features was something that involved a notable amount of creativity, and thus Rybka's "originality" as per Rule #2 was found to be lacking. Rajlich chose not to defend himself against this (or any) charge, and the Board accepted it as valid. Some other pieces of evidence are enumerated in the RECAP pdf (Section 4) and in my Riis rebuttal (section 3.4).

Though the Panel did not adopt it themselves, I had argued that evaluation features could be considered as analogous to the plot of a book, and subject to "protection" in a similar manner. In this regard, one can note that computer programs are considered (in all relevant jurisdictions, it seems) to be literary works.

Finally, I find it odd (to say the least) that Dalke contacted Schröder, but neither me, nor Zach, nor the ICGA, before passing his summary conclusion.

Re: On Dalke

Post by BB+ » Sun Feb 19, 2012 11:20 pm

On another note, Riis writes:
Chess programmers would be useful in pointing at which elements to use and which to filter in the ICGA (Watkins) comparison. However, the "chess programmers" were not involved in any filtering process, since none was done.
Riis is incorrect. As noted above, a filtering process was indeed done (and is mentioned in 4.2.3 of the RECAP, expanded therein in footnote 27). I have yet to decide how to address this with ChessBase. If they do not publish a formal retraction of his claim that no filtering process was done, I will consider this libelous. [Dalke is a bit more generous in merely saying "Rybka investigators" (and additionally probably has the burden of Schröder not informing him about EVAL_COMP), while Riis falsifies my actions personally].

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham

Re: On Dalke

Post by hyatt » Mon Feb 20, 2012 6:48 am

BB+ wrote: On another note, Riis writes:
Chess programmers would be useful in pointing at which elements to use and which to filter in the ICGA (Watkins) comparison. However, the "chess programmers" were not involved in any filtering process, since none was done.
Riis is incorrect. As noted above, a filtering process was indeed done (and is mentioned in 4.2.3 of the RECAP, expanded therein in footnote 27). I have yet to decide how to address this with ChessBase. If they do not publish a formal retraction of his claim that no filtering process was done, I will consider this libelous. [Dalke is a bit more generous in merely saying "Rybka investigators" (and additionally probably has the burden of Schröder not informing him about EVAL_COMP), while Riis falsifies my actions personally].

Note that this is likely another misrepresentation. I specifically said "in the Crafty 19.x / Rybka 1.6.1 examination, NO abstraction and NO filtration was needed. Because it was a dead-on copy." I'd bet he is carrying that statement right on into the Fruit/Rybka discussion where it is irrelevant... It almost appears, to me, as if Dalke was "put up" to write this stuff much as Ed/Chris were feeding Friedel for the ChessBase interview... I find it almost "funny" (not funny - ha-ha) that these guys skim over a LOT of evidence, multiple reports, source code, asm code, summaries, etc. but miss almost all the technical aspects.

Re: On Dalke

Post by BB+ » Mon Feb 20, 2012 8:10 pm

BB+ wrote: This is (in particular) where his analogy to operating systems fails, where non-literal elements are mostly nonexistent.
After discussing the issue with Dalke, he points out (for instance) that CA/Altai [dealing with an adapter for OSes] actually did consider non-literal elements (and indeed, that was the whole debate in that case). Perhaps I should have said:
[...] This is (in particular) where his analogy to operating systems fails -- for instance, CA/Altai [dealing with an OS adapter] found elements which were both non-literal and non-filterable (or protectable) to be mostly nonexistent.
In any event, I claim that a computer chess program does have non-literal protectable elements, whereas CA/Altai found that ADAPTER essentially did not.

Dalke also notes that he just essentially ignored EVAL_COMP (though he was aware of it, it seems), as it was not of interest to him. His view is that copyright does not apply to the elements given therein. Furthermore, as he is not interested in "ICGA originality" (but only GPL/copyright), he found no reason to give it a viewing. He did not specify whether his claim of non-filtration was meant to apply to EVAL_COMP (as interpreted by Riis).

User923005
Posts: 616
Joined: Thu May 19, 2011 1:35 am

Re: On Dalke

Post by User923005 » Tue Feb 21, 2012 12:48 am

So Riis, Dalke, Schroder, van Kervinck, Whittington, Ballicora, etc. have no valid arguments whatsoever and have not made any point to make you reexamine your findings at all?

There is no source code available. Every line is a fiction, one of the infinite number of possible lines that can generate an equivalent assembly code output.
You *have* proven that Rybka uses some Fruit algorithms.
You have not proven (nor can you possibly conclude given the inputs at hand) that Rybka has used Fruit source code.

I do not know if Vas has done something wrong. I used to think that you were objective, but now I am starting to doubt it.

In some sense, every chess program is in violation of rule 2. If (for instance) they use Alpha-Beta to speed the search, this is something that they did not invent and yet it makes their chess program play much better. I bet they did not credit the originators of alpha beta in their notes. Of course, this is silly, you might say and you would be right. But according to the wording of rule 2, it is a clear violation. You might insist that the moves played are different and so it is not a violation but Fruit and Rybka also play different moves.

I think it is worthwhile for you to examine the possibility that what is claimed to have been proven is not what has actually been proven.

IMO-YMMV

Re: On Dalke

Post by BB+ » Tue Feb 21, 2012 2:05 am

User923005 wrote: So Riis, Dalke, Schroder, van Kervinck, Whittington, Ballicora, etc. have no valid arguments whatsoever
Nothing that affects any conclusion, particularly with the evaluation feature comparison to show that Rybka was not "original" in the sense meant by ICGA Rule #2. Riis mentions EVAL_COMP briefly in his Part III. See 3.2 of my response. To the best of my knowledge, Ballicora has said nothing other than to imply that he doesn't like the methodology. MvK appears to think Rybka (2.3.2a) is derivative of Fruit in the EVAL_COMP sense, but disagrees on its ICGA applicability (for various reasons). I'd agree that CW has made some valid comments concerning EVAL_COMP, but nothing he has said is of sufficient magnitude (despite any histrionics he displays). For instance, on what I think is (now) his major contention, he says the filtration was faulty, but has yet (to my knowledge) to give an example of how he would filter isolated pawns. For Dalke, see below.
User923005 wrote: and have not made any point to make you reexamine your findings at all?
I have re-examined them almost every time something new pops up. For instance, I tracked down much about Belle/DT [including Rules from 1989/95] in response to MvK, even though I found the first half of his argument (GPL is implied consent) already to fail irretrievably. Dalke's argument seems to be simply that copyright (he is quite specific that he cares not about ICGA "originality") does not apply to lists of evaluation features. I contend that he misunderstands their effect in a chess program (in terms of the abstraction framework of CA/Altai, they seem commensurate to "parameter lists", but the abstraction-level weighting therein would differ greatly from that given by Davis [the expert] for ADAPTER), and am currently discussing this privately with him.

Most of the other things that I read tend to debate the picayune (UCI parsing, maybe PST).
User923005 wrote: In some sense, every chess program is in violation of rule 2.
This would not be the ICGA sense of their rule. Riis seems to make the same error, in that he claims that "non-literal copying" is just some ICGA mumbo-jumbo, applicable at whim. Rather, to evince such copying, one thing the ICGA Investigation did was undertake a process that concluded there was "substantial similarity" (in an ICGA meaning) between Rybka and Fruit, the analysis having been modelled along the lines of the AFC Test. In some sense, you might as well argue that "non-literal copying" itself is an oxymoron (which some do, I expect), applicable at the whim of a court. The ICGA gave explicit and documented reasons for their conclusion of "non-literal copying", little of which has been touched by later events. Their conclusion is necessarily subjective, but is certainly not whimsical.

Re: On Dalke

Post by User923005 » Tue Feb 21, 2012 2:30 am

Their conclusion is subjective but your conclusion is not subjective? Interesting.

Non-literal copying --> this means nothing more than using the algorithm, if I understand correctly what is being claimed.
If it means something else, please spell out exactly what that is and how it differs from using the algorithm. In what way, exactly does "non-literal copying" {which apparently is a serious crime} differ from "using the same algorithm" {which, as everyone knows, is fine unless there is a patent}.

An algorithm, by the way, is a sequence of steps used to process data to arrive at a programming goal. These exact steps and the result of using these exact steps is not protected. Only the implementation is protected.

For example, every strong engine is now using LMR, an idea "stolen" from Fruit {generated} and Stockfish {perfected}. If you were to take the assembly of LMR from engine X and compare it to the LMR patch from engine Y, I think you will find that they are substantially similar. Enough, in fact, to decide that someone has copied this idea from someone else. And, indeed, they all have.

Are all of these engines in violation of rule 2, and if not, how is what they have done different from what Rybka has done?

Please, let me be clear. It is possible that Vas has done something wrong and I do not attempt to exonerate him. However, it is crystal clear to me that the team trying to convict Vas has not shown what they claim to have shown. Either that, or I do not understand what they have claimed.

I think that the evaluation of Rybka is different from the evaluation of Fruit. I think that the search of Rybka is different from the search of Fruit. Definitely, there are shared algorithms.

Re: On Dalke

Post by hyatt » Tue Feb 21, 2012 6:47 am

User923005 wrote:So Riis, Dalke, Schroder, van Kervinck, Whittington, Ballicora, etc. have no valid arguments whatsoever and have not made any point to make you reexamine your findings at all?

There is no source code available. Every line is a fiction, one of the infinite number of possible lines that can generate an equivalent assembly code output.
You *have* proven that Rybka uses some Fruit algorithms.
You have not proven (nor can you possibly conclude given the inputs at hand) that Rybka has used Fruit source code.

I do not know if Vas has done something wrong. I used to think that you were objective, but now I am starting to doubt it.

In some sense, every chess program is in violation of rule 2. If (for instance) they use Alpha-Beta to speed the search, this is something that they did not invent and yet it makes their chess program play much better. I bet they did not credit the originators of alpha beta in their notes. Of course, this is silly, you might say and you would be right. But according to the wording of rule 2, it is a clear violation. You might insist that the moves played are different and so it is not a violation but Fruit and Rybka also play different moves.

I think it is worthwhile for you to examine the possibility that what is claimed to have been proven is not what has actually been proven.

IMO-YMMV

ICGA rule two does NOT apply to "ideas". Only to code, or a translation of said code. The ICGA doesn't ask about proper attribution for each idea used. It just requires that the program be original to that particular author, not a copy/modify version as we are seeing with all the ippolit/robolito derivatives...

Please cite the specific part of rule 2 you believe concerns ideas, particularly to those many of us that have entered ICGA events and discussed the rules many times as to what they really are intended to prevent...

Re: On Dalke

Post by BB+ » Tue Feb 21, 2012 8:21 am

User923005 wrote: Their [the ICGA's] conclusion is subjective but your conclusion is not subjective? Interesting.
I'm not sure what conclusion of mine is meant, but EVAL_COMP was itself stated to be subjective (both on the wiki before the whole thing was instrumented, and in the document). If the issue were completely "objective", I'd hope I wouldn't have to spend so much time explaining it [as one usually does with subjective things] in forums. :lol:

Re: On Dalke

Post by hyatt » Tue Feb 21, 2012 5:21 pm

User923005 wrote: Non-literal copying --> this means nothing more than using the algorithm, if I understand correctly what is being claimed.
If it means something else, please spell out exactly what that is and how it differs from using the algorithm. In what way, exactly does "non-literal copying" {which apparently is a serious crime} differ from "using the same algorithm" {which, as everyone knows, is fine unless there is a patent}.
This is easy. In this case, non-literal copying would be copying the Fruit source for the eval, then modifying it to work with bitboards rather than mailbox, which requires substantial changes in some places, minor or no changes in others. It has been repeatedly pointed out that it is somewhat akin to translating a book from English to another language. The "copy" won't look the same, character by character or line by line, but the story will have the same plot, same characters, same events, and same ending. In the case of Rybka, he not only had to translate to bitboards, but he used a different base pawn value, so the numbers changed as well...
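
The kind of translation described above can be sketched in Python: the same feature (which files hold isolated pawns) is computed once from a mailbox-style array and once from a bitboard. The square numbering (square = rank * 8 + file), the 'P' marker, and the function names are assumptions for illustration; this is not code from Fruit or Rybka.

```python
FILE_A = 0x0101010101010101  # bits for a1, a2, ..., a8 when square = rank * 8 + file


def file_mask(f):
    """Bitboard with every square of file f (0 = a, 7 = h) set."""
    return FILE_A << f


def isolated_files_mailbox(board):
    """Mailbox style: `board` is a list of 64 entries, 'P' marks a friendly pawn."""
    pawn_files = {sq % 8 for sq, piece in enumerate(board) if piece == 'P'}
    return sorted(f for f in pawn_files
                  if (f - 1) not in pawn_files and (f + 1) not in pawn_files)


def isolated_files_bitboard(pawns):
    """Bitboard style: `pawns` is a 64-bit integer with one bit per friendly pawn."""
    result = []
    for f in range(8):
        if not pawns & file_mask(f):
            continue  # no pawn on this file
        neighbours = 0
        if f > 0:
            neighbours |= file_mask(f - 1)
        if f < 7:
            neighbours |= file_mask(f + 1)
        if not pawns & neighbours:
            result.append(f)
    return result
```

The two functions share no lines, yet they encode the same definition and return the same answer for the same position, which is why a line-by-line comparison alone cannot settle whether one was derived from the other.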