The Evidence against Rybka

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: The Evidence against Rybka

Post by BB+ » Tue Jul 26, 2011 6:13 am

I saw this a few days ago, but was too busy with Sydney duties to respond:
http://rybkaforum.net/cgi-bin/rybkaforu ... #pid354189
Trotsky wrote: But, for now, the BB "results" of
74% Rybka-Fruit "overlap"
54% Crafty-Fruit "overlap"
and around 30% overlap elsewhere
As noted by Adam Hair (one of the few readable posts in that thread, it seems), this applies a mix-and-match for the 54%, though I admit it would be easier to determine this if the ICGA material were better organised. I would also say the quoted bit understates the spread of the elsewhere-results with "around". In all events, the "1 in X" number should be of more interest than any raw percentages.
Trotsky wrote:When BB presented that paper, somebody should have said "er, but your results map to ELO, do they not? go away, do your homework again and come back with something better that isn't going to make us look stupid".
[...]
If anybody cares to put the program ELO's against BB's "scores" he will find high correlation.
I think this criticism underestimates the process of the ICGA Panel. In any event, Adam Hair interpreted the above criticism to mean not that "results map to Elo", but that they map to Elo difference; that is, there should be a correlation between X% overlap and a margin of Y in rating (I'm not sure why AH didn't renormalise/rescale one axis or the other as part of a linear-regression data analysis, but I agree that any correlation is somewhat unremarkable in the first place, particularly in comparison to the Rybka/Fruit outlier). However, my interpretation (also noted in passing by AH) is that the point here is that Rybka/Fruit are the strongest engines in the set, and everything else is much lower-rated -- with the idea being that engines should become more similar as they become stronger and/or their authors have more knowledge. A counterpoint to this latter claim [at least in its current state] would be R3, which is even stronger, yet would not have too great an "evaluation feature" overlap with engines of interest. Similarly with Stockfish, if you wish to exit the Rybka family for comparison.

Going back to the Panel, the Elo argument (as to rebutting EVAL_COMP) was partially/tangentially broached in Panel discussions, and essentially rejected. This was both because any correlation seemed to be weak at best, and also because Elo strength is not directly relevant for "originality" purposes. If Rajlich had made this argument, and (say) requested that a similarly-rated engine from 2005-6 submit its source code to be inspected, etc., I fully expect this would have been done -- but he chose not to dispute the issue. [Indeed, usually the accused has to make a specific defense, and can't rely on the sum-total of all possible defenses to be raised on his behalf].

I can also note that EVAL_COMP was only produced due to a desire to ensure that this "evaluation feature" evidence could be sufficiently quantified. Although it formed a large portion of the Fruit/Rybka evidence, the "evaluation feature overlap" was already accepted by many Panel members in its qualitative state, and "evaluation features" themselves were only one part of the total evidence presented. For instance, for "probative similarity" one can note the seemingly literally copied code in the search control and iterative deepening routines, mentioned respectively at the end of Section 6.3.2 and in Appendix A of the RYBKA_FRUIT document.

Trotsky wrote:It is quite impossible to say what the developmental process was, there is no proof that A=Fruit, any better than there is proof that A=developmental Rybka, and Vas kept a copy of Fruit and/or Crafty and/or anything else open on other screens at the final development stage.
Copyright law is based upon "substantial similarity" and has a rather low threshold for proof (particularly in Poland). It is rather agnostic as to what the "developmental process" is, though I agree that this could be raised as an affirmative defense [which Rajlich chose not to do in the ICGA process].

I can't find the post anymore, but I also saw one that mentioned that the ICGA process only looked at the evaluation features, but not their relative weightings [and perhaps mocked the Panel for this]. This aspect was discussed, and the general opinion was that evaluation features were already sufficient for non-originality. For the forthcoming legal process by Fabien/FSF, the question of the relative importance of evaluation features and their weightings will be discussed more, if nothing else as a "percentage" basis of copyright infringement. Here it could be noted that relative weightings, at least for purposes of optimising strength, can often be tuned via an automated (thus less creative) process. [Alan Sassler noted that one could try a similar methodology to construct evaluation features, though describing tics of various implementations is nontrivial, and I don't think anyone has made this workable in general -- and even if so, it seems quite unlikely that one would reproduce the Fruit evaluation features from any reasonable-sized set].

Post by BB+ » Thu Jul 28, 2011 10:18 pm

Another difficulty with the Elo-differential comparison to "evaluation features" is that it tends to "trivialise" other parts of the program. For instance, if you rewrite an engine to make it faster (e.g., mailbox->bitboards, with 32->64 bit machines), you might gain 50+ Elo, but keep the same evaluation features. Or you might add SMP, etc. Another possibility is to change the search, the canonical example here being null move, which can add 100-200 Elo. Indeed, an obvious datapoint to adduce is that R232a has much the same eval as R1, but is notably stronger.

So my contention is that this criticism, like many others of its genre, doesn't hold water when analysed further -- and perhaps more important from the Panel standpoint, didn't really seem that likely in the first place. As indicated in other places, the Panel was willing to consider possible defences that seemed reasonable and/or likely, but left more tendentious arguments to be made by Rajlich himself.

Post by BB+ » Thu Jul 28, 2011 10:31 pm

The question of PST comparison has also been raised. My own opinion is that this is not that important for copyright issues (other than being "yet another" commonality), while the Panel never really discussed it for "originality", presumably as there was already a plethora of evidence against Rybka. However, I will briefly note the following:

Knight PST (Opening):
Fruit's can be computed from 2 free parameters, a specific formula (plus a correction for a8/h8), and two specific arrays of 4 and 8 entries.
Rybka's can be computed from 2 free parameters, the same specific formula (plus a correction for a8/h8), and the same two specific arrays of 4 and 8 entries.

From the standpoint of "information content", the similarities notably outnumber the differences. And as noted elsewhere, these "free parameters" are perhaps more subject to tuning by an automatic process than the other parts.
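
To make the "information content" point concrete, here is a minimal sketch in C of how a knight table of this general shape could be generated. The function names, the exact form of the formula, and every numeric value used with it are hypothetical placeholders chosen for illustration; none of them is claimed to be taken from Fruit or Rybka. The only things borrowed from the description above are the ingredients: a 4-entry array, an 8-entry array, two weights, and a correction on a8/h8.

```c
#include <assert.h>

/* Square indexing assumed here: sq = rank * 8 + file, with a1 = 0 and h8 = 63. */
enum { A8 = 56, H8 = 63 };

/* Mirror a 4-entry half-line onto 8 files or ranks: indices 0 1 2 3 3 2 1 0. */
static int mirror4(const int half[4], int i)
{
    assert(i >= 0 && i < 8);
    return half[i < 4 ? i : 7 - i];
}

/* Fill a 64-entry table from two small arrays and two weights (the "free
   parameters"), then apply a fixed penalty to the two far corners.
   This formula is an illustrative assumption, not either engine's code. */
void build_knight_pst(int pst[64], const int half[4], const int by_rank[8],
                      int centre_weight, int rank_weight, int corner_penalty)
{
    for (int sq = 0; sq < 64; sq++) {
        int file = sq % 8;
        int rank = sq / 8;
        pst[sq] = centre_weight * (mirror4(half, file) + mirror4(half, rank))
                + rank_weight * by_rank[rank];
    }
    pst[A8] -= corner_penalty;
    pst[H8] -= corner_penalty;
}
```

Called twice with the same two arrays and different weights, this yields two tables whose 64 numbers differ even though everything except the weights is shared, which is the sense in which the similarities can outnumber the differences.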

Bishop PST:
Fruit's can be computed from 3 free parameters (two of which are zeroed in the endgame), and a specific formula with a specific array of 4 entries.
Rybka's can be computed from 3 free parameters (the same two of which are zeroed in the endgame), and the same specific formula and array of 4 entries.

Etc. I don't know of any pre-Fruit engines to which one can apply these statements.

Post by BB+ » Mon Aug 08, 2011 9:12 am

Trotsky wrote:1. The arrays are not copyrightable. They are too trivial and other programs, possibly all other programs, can be shown to use them, or some simple transform of them, or some minor variation of them since the start of computer chess forty years or so ago.
This is a bit like saying individual words in a book are "not copyrightable" -- yet the creative overall use of them is. Also, the codicils like "simple transform" and "minor variation" cannot be ignored in copyright -- if Rybka had made such "modifications", there would probably not be as much of a debate regarding the PST. [Actually, I'm not sure there really is a debate regarding PST, as the civil action against Rajlich probably won't mention it except as "yet another" example, and the ICGA process never really discussed it]. In the case of Fruit/Rybka PST, it seems that there are 8 distinctive "ramping" arrays that re-appear with exactly the same usage conditions.
Trotsky wrote:2. The PST tables in TSCP, for example, show distinct usage of very similar arrays. By simple visual observation. I imagine that TSCP PSTs could be linked algorithmically to both Rybka and Fruit PSTs, rather in the same way that the accusing group has tried to show that Rybka and Fruit PSTs can be linked.
I imagine the opposite. I don't see any easy way to link the TSCP PST to that of Fruit, at least in any way that has the same reduction of information complexity as Fruit/Rybka.
Trotsky wrote:Hyatt posted the TSCP knight table
int knight_pcsq[64] = {
-10, -10, -10, -10, -10, -10, -10, -10,
-10, 0, 0, 0, 0, 0, 0, -10,
-10, 0, 5, 5, 5, 5, 0, -10,
-10, 0, 5, 10, 10, 5, 0, -10,
-10, 0, 5, 10, 10, 5, 0, -10,
-10, 0, 5, 5, 5, 5, 0, -10,
-10, 0, 0, 0, 0, 0, 0, -10,
-10, -30, -10, -10, -10, -10, -30, -10
};

If we take the 5th rank and divide each element by 5, we get
-2, 0, 1, 2, 2, 1, 0, -2,

If we now subtract 1 from each element, we get
-3, -1, 0, 1, 1, 0, -1, -3, [...]
One factor here is that you took a specific rank. My guess is that if I looked hard enough, I would find (via enough transforms and data mining) the array -3 -1 0 1 somewhere in many chess engines (like finding given words in two texts). But the operative question is whether this is used the same way in TSCP as in Fruit/Rybka. If Fruit/Rybka had also only used this to get the 4th/5th rank of knights, maybe there would be a point -- but Fruit/Rybka both use this "ramping" array in many different ways, and both use it in the same places, and in the same ways. [And Fruit/Rybka don't use it at all for knights as TSCP does, which makes the whole example rather nugatory, as, to repeat yet again, the main point is that Fruit/Rybka use the same "ramping" arrays in the same way].
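
The arithmetic in the quoted transform is easy to check mechanically. The sketch below is only an illustration: `transform_row` is a hypothetical helper name, and the row numbering (row 0 is the first row as the table is written) is an assumption about how the quoted table is laid out.

```c
#include <assert.h>

/* TSCP knight table, copied from the quote above. */
static const int knight_pcsq[64] = {
    -10, -10, -10, -10, -10, -10, -10, -10,
    -10,   0,   0,   0,   0,   0,   0, -10,
    -10,   0,   5,   5,   5,   5,   0, -10,
    -10,   0,   5,  10,  10,   5,   0, -10,
    -10,   0,   5,  10,  10,   5,   0, -10,
    -10,   0,   5,   5,   5,   5,   0, -10,
    -10,   0,   0,   0,   0,   0,   0, -10,
    -10, -30, -10, -10, -10, -10, -30, -10
};

/* Take one row of an 8x8 table and compute out[i] = row[i] / divisor - offset.
   Row 0 is the first row as written; the helper name is hypothetical. */
void transform_row(const int table[64], int row, int divisor, int offset,
                   int out[8])
{
    assert(row >= 0 && row < 8);
    assert(divisor != 0);
    for (int i = 0; i < 8; i++)
        out[i] = table[row * 8 + i] / divisor - offset;
}
```

With divisor 5 and offset 1, rows 3 and 4 (the two identical central rows) give -3 -1 0 1 1 0 -1 -3, as in the quote. Row 2 under the same transform gives -3 -1 0 0 0 0 -1 -3 and row 0 gives -3 eight times, so the match depends on which rank is selected.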

Post by BB+ » Tue Aug 16, 2011 9:02 pm

The question of the "minuteness" of various evidence against Rybka has been queried. Here is a relevant legal standard:
West Publishing Co. v. Edward Thompson Co. C.C., 169 F. 833, 854 wrote:To constitute an invasion of copyright it is not necessary that the whole of a work should be copied, nor even a large portion of it in form or substance, but that, if so much is taken that the value of the original is sensibly diminished, or the labors of the original author are substantially, to an injurious extent, appropriated by another, that is sufficient to constitute an infringement.
I would say that Fabien thinks his work was "appropriated by another" too substantially, which is why he requested the ICGA to investigate.

On another issue, regarding [the irrelevance of] what "process" might have been used to develop Rybka 1.0 Beta (be it copy/paste, transliteration, or some sort of Fruit-osmosis from going "forwards and backwards"), there is this (particularly the last sentence):
Edwards & Deutsch Lithographing Co. v. Boorman, 15 F. 2d 35 wrote:[...] One may copy from memory. It is not necessary to such act that the copied article be before him at the time. Impressions register in our memories, and it is difficult at times to tell what calls them up. If the thing covered by a copyright has become familiar to the mind's eye, and one produces it from memory and writes it down, he copies just the same, and this may be done without conscious plagiarism. In this case, in all the essentials of the thing copyrighted, similarity amounts to identity, and the evidence establishes infringement. http://scholar.google.com.au/scholar_ca ... i=scholarr

Chris Whittington
Posts: 437
Joined: Wed Jun 09, 2010 6:25 pm

Post by Chris Whittington » Tue Aug 16, 2011 9:20 pm

BB+ wrote: On another issue, regarding [the irrelevance of] what "process" might have been used to develop Rybka 1.0 Beta (be it copy/paste, transliteration, or some sort of Fruit-osmosis from going "forwards and backwards"), there is this (particularly the last sentence):
Edwards & Deutsch Lithographing Co. v. Boorman, 15 F. 2d 35 wrote:[...] One may copy from memory. It is not necessary to such act that the copied article be before him at the time. Impressions register in our memories, and it is difficult at times to tell what calls them up. If the thing covered by a copyright has become familiar to the mind's eye, and one produces it from memory and writes it down, he copies just the same, and this may be done without conscious plagiarism. In this case, in all the essentials of the thing copyrighted, similarity amounts to identity, and the evidence establishes infringement. http://scholar.google.com.au/scholar_ca ... i=scholarr
You just created, presumably inadvertently, the necessity to adopt an entirely different approach to "copyright infringement" of open source. I, as a chess program developer (once upon a time, thank god), am now penalised simply by having read it, something I am encouraged to do.

It's the prissy open source problem, you can read it but you can't commit it to memory. Ridiculous.

The ONLY way out of this mess, created heavily by you and your "good intentions", is a wholesale rethink and rewrite of copyright law, ICGA rules and general mindset to a state of general usage and freedom.

Post by BB+ » Tue Aug 16, 2011 9:53 pm

Chris Whittington wrote:You just created, presumably inadvertently, the necessity to adopt an entirely different approach to "copyright infringement" of open source. I, as a chess program developer (once upon a time, thank god), am now penalised simply by having read it, something I am encouraged to do.
If you read it, and then create from this knowledge something that is "substantially similar" (always the over-riding criterion, as the last sentence quoted above says), then indeed you have infringed its copyright. If you read it, and then use that knowledge in a way that is not "substantially similar" to the original, then you've done what open source expects. Here "it" refers to a computer program, but note that I left it as "it", rather intentionally, as the argument is abstract.

Similarly, if you are a novelist, and you read someone's book, you might get some ideas for a new novel. If your rendition of these ideas is "substantially similar" to the book you read [or perhaps just a section of it], then you have infringed copyright. If you separate the idea/expression sufficiently well so that your novel isn't "substantially similar" to what you've read, then you've enjoyed the book in the manner that was intended (by the author/publisher), and created something original of your own to boot.

As I think I said in a different post, for some reason the existence of the "computer" in this programming picture has a tendency to warp common thinking -- but copyright law (for better or worse) classifies many types of computer programs to be literary works, rather than functional devices.

Post by Chris Whittington » Tue Aug 16, 2011 10:08 pm

How can I help but use it, for it is now in my memory, influencing me? Ah, of course, I will remember the academic gestapo will chase me forever if I dare even have those bitboard or PST thoughts.

So, Herr Albert Speer, that's a very nice thought crime you've created with your prissy open source, my reading it, and your GPL for babies nonsense.

I think there has to be another way. Don't you?

Jeremy Bernstein
Site Admin
Posts: 1226
Joined: Wed Jun 09, 2010 7:49 am
Real Name: Jeremy Bernstein
Location: Berlin, Germany

Post by Jeremy Bernstein » Tue Aug 16, 2011 11:28 pm

And there I was wondering who would verify Godwin's Law first. :roll:

Jeremy

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham

Post by hyatt » Wed Aug 17, 2011 3:29 am

Chris Whittington wrote:
You just created, presumably inadvertently, the necessity to adopt an entirely different approach to "copyright infringement" of open source. I, as a chess program developer (once upon a time, thank god), am now penalised simply by having read it, something I am encouraged to do.

It's the prissy open source problem, you can read it but you can't commit it to memory. Ridiculous.

The ONLY way out of this mess, created heavily by you and your "good intentions", is a wholesale rethink and rewrite of copyright law, ICGA rules and general mindset to a state of general usage and freedom.

You act like this is new. My calculus teacher, when I was a freshman in 1966, used a (now) old book, "Elements of differential and integral calculus". Maybe 350 pages long. He could sit at his desk, prop his feet up, close his eyes, and recite verbatim from any part of that book. Without opening his eyes. Someone would ask him about a trigonometric substitution he made mentally. He'd say "turn to page 327, look about 1/2 way down, and you will see <recited formula here>".

Do you really think he could then run off and typeset that stuff in his head, have it published as his own work, and get away with it? Of course not. I have looked at many program sources over the years. Chess programs specifically. And I would occasionally notice something interesting (say in the eval). And I'd look to see what it was evaluating and how. And I could then, at some point, go write something to evaluate that same idea, and most of the time would also think of new things to include, or better ways of making it work, so that it ended up as "yes, we evaluate feature XXX" but when you look at the code, about all you could say was they were both evaluating king safety or something equally vague, because the two programs did not express the idea in the same way. In fact, often the "ideas" were similar in topic only, not in any form of expression, order, things tested, etc.

One doesn't have to emulate precisely an abstract idea. If all you can do is look at code and copy it, or look at and study code until you know exactly how it works and then go write something that works exactly the same way, I'd hardly call that person "an original genius." I'd just say "he has a good memory, nothing more..."

The idea of studying a program to get an overall feel for what it does is perfectly reasonable. If you are wondering if your implementation of an idea is flawed and you look at another program to compare, that seems reasonable, if you use the "peek" to fix your code, as opposed to "lifting his code outright."

This is right out of software engineering 101. You start at a very high level, and figure out what it is you are supposed to write, in very abstract and high-level (non-specific) terms using the language of the user. Then from that you develop specifications that are at a very high-level, no code/pseudo-code/flow-charts. You iron those out with the end-user. No code at all yet. Then you do a top-level design, giving an overall look/feel to how the final program will look, but very high-level, lots of black-boxes, etc. It is only when you get down to the (sometimes called) architectural design where you start stating what is done, how it is done, and what order it is done in, that you begin to take too much from another program, in this context... For example, I want to evaluate weak pawns. Let me look at XX to see what he does.

OK, a weak pawn is a pawn that can't easily (or ever) be defended by a pawn. This could be isolated, or backward, or even a pawn duo where one of the duo is on an open file and can't safely advance because of enemy pawns. That is all abstract and taking that kind of idea seems perfectly valid. But suppose you dig deeper to see how he determines whether a pawn can safely be defended by another, which is a pretty "dynamic question" since pawns can move or not depending on lots of things. Digging in to that level means you are now copying the essence of the implementation, rather than the abstract idea of "a weak pawn."
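
As a concrete (and deliberately simple) illustration of the gap between the idea and one expression of it, here is a minimal sketch in C of the easiest case above, the isolated pawn. The per-file pawn-count representation and the function name are assumptions made for this example; it is not taken from Crafty or any other program, and a real engine would likely express the same idea quite differently (bitboards, pawn hash tables, and so on).

```c
#include <assert.h>
#include <stdbool.h>

/* pawns_on_file[f] = number of friendly pawns on file f (0 = a-file, 7 = h-file).
   A pawn on `file` is treated as isolated when neither adjacent file
   holds a friendly pawn. Hypothetical helper, for illustration only. */
bool is_isolated(const int pawns_on_file[8], int file)
{
    assert(file >= 0 && file < 8);
    bool left  = file > 0 && pawns_on_file[file - 1] > 0;
    bool right = file < 7 && pawns_on_file[file + 1] > 0;
    return !left && !right;
}
```

The "backward" and "pawn duo" cases depend on where pawns can still move, so they have no similarly obvious one-liner; that is where implementations start to diverge.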

Seems perfectly reasonable to me as we (the group of programmers competing in these events) have always agreed that copying/using ideas presented by others was just fine, but not using/copying implementation details at a low level (such as source code).
Last edited by hyatt on Wed Aug 17, 2011 3:37 am, edited 1 time in total.
