What do you folks make of this ?

Chris Whittington · Post by **Chris Whittington** » Thu Jul 01, 2010 1:10 pm

Rebel wrote:
Chris Whittington wrote: Well, I used the technique in 1980-something in Z80 assembler to jump out of the search, all you need to do then is reset the stack pointer and you're fine, back at the tree root. I changed to a more 'acceptable' method of unwinding back up the tree after staff programmers made a huge fuss about how uncompliant the technique was. But, Oxford Softworks later licensed a small chess program source for porting to various small platforms and I was surprised to see the programmer using setjmp (ie jump out of the search and reset vital pointers), told him to get 'proper(!)' and he refused, claiming it was just fine. This was a highly educated and qualified university guy who wrote masses of complex stuff in a bunch of fields and was highly reliable for producing bug free code, fast.

setjmp is way more common than Bob might want to think, it may offend purists but no way can it be suggested it is so rare that dual use implies plagiarism.

Curiously, in answer to someone recently, I read that Bob postulated setjmp as the most incriminating piece of code in Rybka!!
I can imagine the use of "setjmp" in a chess program. It's excellent for recursive stuff. Like you I used it in my early 6502 assembler days to save ROM space. Just push/pop the stack, there you go, wonderful.... If Bob really said the above bold he better retract. I know another programmer who also used it, maybe still does. That makes 3 already.

Ed

Sven Schule asked:
But since you know the Zach examples, which of these points would serve "best" from your viewpoint to prove that code was literally copied from Fruit 2.1 to Rybka 1.0 beta? Just one example is sufficient in the beginning.

Bob Hyatt replied:
The first one that jumped out at me when we started the process was the code segment containing the setjmp()/longjmp() construct. You can find references to this in the past. It is an unusual way to unwind a search that is not thread-happy, and invites very subtle bugs. The usage was identical in both programs with the surrounding code. There were others.

Sentinel · Post by **Sentinel** » Thu Jul 01, 2010 2:06 pm

Chris Whittington wrote: Bob Hyatt replied:
The first one that jumped out at me when we started the process was the code segment containing the setjmp()/longjmp() construct. You can find references to this in the past. It is an unusual way to unwind a search that is not thread-happy, and invites very subtle bugs. **The usage was identical in both programs with the surrounding code.** There were others.

You can't just ignore the bolded sentence. It is important. It suggests not only that the idea of setjmp()/longjmp() is used, but that the way it is used is identical, meaning the surrounding code is copied verbatim.

Michel Van den Bergh · Post by **Michel Van den Bergh** » Thu Jul 01, 2010 2:40 pm

The first one that jumped out at me when we started the process was the code segment containing the setjmp()/longjmp() construct. You can find references to this in the past. It is an unusual way to unwind a search that is not thread-happy, and invites very subtle bugs.

I am curious why this would not be thread safe. And why you consider it unnatural.

Chris Whittington · Post by **Chris Whittington** » Thu Jul 01, 2010 5:21 pm

Michel Van den Bergh wrote:
The first one that jumped out at me when we started the process was the code segment containing the setjmp()/longjmp() construct. You can find references to this in the past. It is an unusual way to unwind a search that is not thread-happy, and invites very subtle bugs.
I am curious why this would not be thread safe. And why you consider it unnatural.

Just to be helpful, here's the User Guide to setjmp()/longjmp(). It's fairly specific about what to do, in that using it wrongly, or without consulting the user guide, is likely to lead to crash and burn.

It is NOT unusual for the purpose of jumping out of search and returning to the root. Ed used it, I used a more primitive version of it until I got hassled by support programmers, one of my programmers used it and refused to stop using it. Bob really can't claim use of setjmp() means copying, or that the code around it means copying, because the code around it is quite likely forced anyway.

setjmp

Summary
#include <setjmp.h>

int setjmp (
jmp_buf env); /* current environment */
Description
The setjmp function saves the current state of the CPU in env. The state may be restored by a subsequent call to the longjmp function. When used together, the setjmp and longjmp functions provide you with a way to execute a non-local goto.

A call to the setjmp function saves the current instruction address as well as other CPU registers. A subsequent call to the longjmp function restores the instruction pointer and registers, and execution resumes at the point just after the setjmp call.

Local variables and function arguments are restored only if declared with the volatile attribute.

Return Value
The setjmp function returns a value of 0 when the current state of the CPU has been copied to env. A non-zero value indicates that the longjmp function was executed to return to the setjmp function call. In such a case, the return value is the value passed to the longjmp function.

See Also
longjmp

Example
#include <setjmp.h>
#include <stdio.h> /* for printf */

jmp_buf env; /* jump environment (must be global) */
bit error_flag; /* bit is a compiler extension, not standard C */

void trigger (void) {
    /* ... processing code here ... */
    if (error_flag != 0) {
        longjmp (env, 1); /* return 1 to setjmp */
    }
    /* ... */
}

void recover (void) {
    /* recovery code here */
}

void tst_longjmp (void) {
    /* ... */
    if (setjmp (env) != 0) { /* non-zero: we got here via longjmp */
        printf ("LONGJMP called\n");
        recover ();
    }
    else { /* zero: setjmp has just saved the environment */
        printf ("SETJMP called\n");

        error_flag = 1; /* force an error */

        trigger ();
    }
}
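
The guide's remark about volatile has a close relative in standard C: an automatic variable that is not volatile and is changed between the setjmp call and the longjmp has an indeterminate value after the jump. The following is a minimal sketch of that rule in portable C, not part of the quoted guide; the names `fail_after` and `count_until_jump` are made up for illustration.

```c
#include <setjmp.h>

static jmp_buf env;              /* jump environment shared by both functions */

/* Hypothetical helper: jumps back to the setjmp point once n reaches limit. */
static void fail_after(int n, int limit) {
    if (n >= limit) {
        longjmp(env, 1);         /* setjmp will appear to return 1 */
    }
}

/* Counts loop iterations until fail_after() jumps out.
   'count' is volatile because it is modified after setjmp() and read
   after longjmp(); without volatile its value would be indeterminate. */
int count_until_jump(int limit) {
    volatile int count = 0;

    if (setjmp(env) != 0) {
        return count;            /* reached via longjmp */
    }
    for (;;) {
        count++;
        fail_after(count, limit);
    }
}
```

Calling `count_until_jump(3)` runs the loop three times before the jump and then returns the counter through the setjmp branch.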

longjmp

Summary
#include <setjmp.h>

void longjmp (
jmp_buf env, /* environment to restore */
int retval); /* return value */
Description
The longjmp function restores the state which was previously stored in env by the setjmp function. The retval argument specifies the value to return from the setjmp function.

The longjmp and setjmp functions may be used to execute a non-local goto. They are usually utilized to pass control to an error recovery routine.

Local variables and function arguments are restored only if declared with the volatile attribute.

Return Value
None.

See Also
setjmp

Chris Whittington · Post by **Chris Whittington** » Thu Jul 01, 2010 5:31 pm

Sentinel wrote:
Chris Whittington wrote: Bob Hyatt replied:
The first one that jumped out at me when we started the process was the code segment containing the setjmp()/longjmp() construct. You can find references to this in the past. It is an unusual way to unwind a search that is not thread-happy, and invites very subtle bugs. **The usage was identical in both programs with the surrounding code.** There were others.
You can't just ignore the bolded sentence. It is important. It suggests not only that the idea of setjmp()/longjmp() is used, but that the way it is used is identical, meaning the surrounding code is copied verbatim.

As far as I know, the accusers have not published the surrounding code; if they do, we'll take a look. In any case, setjmp/longjmp are described in the user guide and have to be used carefully (although it is not complex), else crash and burn. I'd guess that most users of the functions have similar surrounding code, and I doubt the code surrounding setjmp/longjmp is anybody's copyright, in that most usages are going to be very similar, by force.

Bob's argument btw was that mere usage of these allegedly rare and unsound (according to Bob) routines was enough to suspect plagiarism, on his another-nail-in-the-coffin, quantity theory, despite weakness in the quality. As Lenin once famously said, and Bob plagiarises in a functionally equivalent mode, quantity has a quality all of its own. Did Bob read Lenin? Doubtful. Two great minds come up with the same idea?! Well it happens, and Bob is surely no clone of Lenin.

hyatt · Post by **hyatt** » Thu Jul 01, 2010 5:42 pm

BB+ wrote:
Well, in a sense you didn't need to say it because it came across through reading.
You are assuming that the average person read past the first page.

I largely agree with what you say about the ZW analysis.

Here is a list that I came up with (again I rely partially on ZW) of "suspicious" things in Rybka:
* Re-use of exact same File/Rank/Line arrays in PST values (as opposed to an "idea" where statics would be built up in this way, but with different numbers)
* Time management and UCI parsing, particularly the "0.0" appearance
* Copying of the position at the top of the search (Fruit actually copies it back at every iteration), which is pointless as-is in Rybka (it is copied back after the search, again for no reason). Both search 4 ply when a move is forced. [You can also include setjmp under this "Search Start" heading if you like, and I haven't checked whether Rybka actually copies the position, as opposed to Strelka].
* Hash entries: more differences than similarities perhaps, but both start with the same "initial segment" (lock, move, depth, date). The same is true for pawn hash (the size of the entries is not even the same in that case). [Strelka is annoying here, as it does not preserve the Rybka data structures in all cases].
* Great similarity in evaluation. Here the "ideas" concept comes into play. Specific things could be the 1:2:4 weighting of minor:rook:queen (why not 3:5:9?) and the identical minutiae with the DrawBishopFlag -- to offset the former, Fruit has a linear interpolation across phases, while Rybka's is more complicated.
* The use of 10, 30, 60, 100 weighting in passed pawns. If this were a one-off, I could believe the recurrence of this numerology was accidental.

I would think that if you use setjmp()/longjmp() as a way to "back out" of the search, you would be forced to copy the position back, since it is global and you would not know what state it was in after that abrupt search termination at an unknown point. Which highlights another similarity, using a very ugly mechanism that most avoid for that very reason (setjmp/longjmp in the first place). Been teaching for a long time, having students (in AI class) writing othello (and other) game-playing programs using alpha/beta, and to date I have never had one program that used this mechanism to exit the search. It is rare, indeed.
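
To make the point about the global position concrete, here is a minimal sketch, not code from Fruit, Rybka or any other engine. The `Position` struct, the node budget standing in for a clock, and all function names are assumptions for illustration. The search modifies a global position with make/unmake, so after longjmp the position is left wherever the search happened to be and is put back from a copy taken at the root.

```c
#include <setjmp.h>
#include <string.h>

#define MAX_PLY 8

/* Toy stand-in for a chess position: just the moves made so far. */
typedef struct {
    int depth;
    int moves[MAX_PLY];
} Position;

static Position pos;             /* global position, modified by make/unmake */
static jmp_buf abort_env;        /* where to land when the search is aborted */
static long nodes, node_limit;   /* node budget standing in for a clock */

static void make_move(int m)  { pos.moves[pos.depth++] = m; }
static void unmake_move(void) { pos.depth--; }

static void search(int depth) {
    if (++nodes > node_limit) {
        longjmp(abort_env, 1);   /* leave immediately; pos is mid-tree */
    }
    if (depth == 0) return;
    for (int m = 0; m < 2; m++) {
        make_move(m);
        search(depth - 1);
        unmake_move();
    }
}

/* Returns 1 if the search was aborted, 0 if it completed.
   Assumes depth <= MAX_PLY. */
int search_root(int depth, long limit) {
    Position saved;
    memcpy(&saved, &pos, sizeof pos);    /* copy taken before setjmp */
    nodes = 0;
    node_limit = limit;

    if (setjmp(abort_env) != 0) {
        /* The moves made below the root were never unmade: copy back. */
        memcpy(&pos, &saved, sizeof pos);
        return 1;
    }
    search(depth);
    return 0;
}

int root_depth(void) { return pos.depth; }
```

With a budget of 5 nodes and depth 4, the jump happens four plies deep, and only the copy-back returns `pos` to the root state.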

hyatt · Post by **hyatt** » Thu Jul 01, 2010 5:54 pm

Rebel wrote: Allow me some remarks. I deliberately take the Vas=innocent position for the sake of the discussion.

BB+ wrote: Here is a list that I came up with (again I rely partially on ZW) of "suspicious" things in Rybka:
* Re-use of exact same File/Rank/Line arrays in PST values (as opposed to an "idea" where statics would be built up in this way, but with different numbers)
One can only add the word suspicious if one has checked more programs (say 10-15) and found no such similarities. Perhaps things like these are common in many chess programs. Have Bob, Zach, you, checked 10-15 programs for its absence?

Not 10-15, but > 5 in my case. It does take a little work to compare, because some (stockfish/glaurung for example) don't use pawn=100, so you have to scale values. But I have looked at several when I was tuning mine to see if what I was using was missing something important. Everyone has a trend. Knights toward center, away from edge. But, except for cases where I can show that _my_ values were copied by another program, I never found anything near exact copies. Some have certainly "borrowed" my numbers since they know that we vetted them with lots of games in our cluster testing. But not fruit or rybka 1 as they came earlier.

* Time management and UCI parsing, particularly the "0.0" appearance
The 0.0 case is indeed suspicious. UCI parsing: perhaps its code is public domain. Fabien took it and so did Vas. Has this option been researched by Zach, Bob and you?

When someone says "There is zero fruit code in Rybka 1" I take the common definition of "zero". 0.000, not "a little" or "not very much".

* Copying of the position at the top of the search (Fruit actually copies it back at every iteration), which is pointless as-is in Rybka (it is copied back after the search, again for no reason). Both search 4 ply when a move is forced. [You can also include setjmp under this "Search Start" heading if you like, and I haven't checked whether Rybka actually copies the position, as opposed to Strelka].
Mine does the same, copy the board before the search starts. It doesn't matter if that is pointless, there are many things in mine that are not in use, they are either remains of previous ideas and forgotten to remove or I leave them there on purpose for future ideas.

I'll try to look at the code later today, but this idea might be a necessary artifact caused by setjmp()/longjmp() as a way to "unwind" the search. If you just rip yourself out of the middle of a search, using a global chess position, you would have to have some way to restore the position to a sane state. I do not recall if this was the way Fruit worked (global position, vs copying it to a temp space to be used by the search). If one does copy/make then you don't need to unwind; if one has a global make/unmake mechanism, this is an issue. Most have seen the old performance discussions showing copy/make is too slow on a PC.
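
For contrast, a rough sketch of the copy/make case, again with made-up names (`Board`, `visit`, `search_root_copy`) and a node budget in place of a clock. Each ply works on its own copy of the position, so an abrupt longjmp leaves the root position untouched and nothing has to be copied back. This only illustrates the structure; it says nothing about how fast either scheme is.

```c
#include <setjmp.h>

#define BOARD_MAX_PLY 8

/* Toy stand-in for a chess position. */
typedef struct {
    int depth;
    int moves[BOARD_MAX_PLY];
} Board;

static jmp_buf stop_env;             /* where to land on abort */
static long visited, visit_limit;    /* node budget standing in for a clock */

/* Copy/make: build the child as a fresh copy; the parent is never modified. */
static Board make_copy(const Board *parent, int m) {
    Board child = *parent;
    child.moves[child.depth++] = m;
    return child;
}

static void visit(const Board *b, int depth) {
    if (++visited > visit_limit) {
        longjmp(stop_env, 1);        /* the per-ply copies are simply abandoned */
    }
    if (depth == 0) return;
    for (int m = 0; m < 2; m++) {
        Board child = make_copy(b, m);
        visit(&child, depth - 1);
    }
}

/* Returns 1 if aborted, 0 if completed. The root is only ever read.
   Assumes root->depth + depth <= BOARD_MAX_PLY. */
int search_root_copy(const Board *root, int depth, long limit) {
    visited = 0;
    visit_limit = limit;
    if (setjmp(stop_env) != 0) {
        return 1;                    /* no position recovery needed */
    }
    visit(root, depth);
    return 0;
}
```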

Mine searched forced moves 3 plies in the early days until I increased it to 5 in order to have a better move to ponder. I don't see the relevance, 4 is an excellent value also and I suspect most programs do it this way.

There are not many good choices here. I've generally used 4 myself, although I am not sure what I do today as that has not been modified in years.

* Great similarity in evaluation. Here the "ideas" concept comes into play. Specific things could be the 1:2:4 weighting of minor:rook:queen (why not 3:5:9?) and the identical minutiae with the DrawBishopFlag -- to offset the former, Fruit has a linear interpolation across phases, while Rybka's is more complicated.
Using Vas's own words: I took many things. I don't see the relevance.

He did say "ideas" but not "code".

When I was going through the Fruit-eval I wrote down 2 ideas, code for trapped (white) bishop (h6/a6) and the code for freeing one's rook in Kg1<>Rh1 situations. It's perfectly legal to add these ideas to mine, release it without any breach of the GPL. Again, what's the relevance?

Ideas vs Code. This has been done in many programs. I remember starting the "craze" in 95 or so because I added this code, and started winning all sorts of games by leaving the h2/h7 or a2/a7 pawn undefended so that my opponent would take it. At the time, hardly any commercial program understood the idea, and they would defend those pawns immediately. Crafty would leave 'em hanging, then push the b or g pawn to trap the bishop and win the piece for two pawns, generally. So the idea is well-known today. But the way it is implemented varies all over the map. Some do it where they evaluate bishops in general, some do it at some "special case" place in their code where they check for several oddball conditions, etc.

* The use of 10, 30, 60, 100 weighting in passed pawns. If this were a one-off, I could believe the recurrence of this numerology was accidental.
Mine has similar values. It's not suspicious at all.

Note the use of the word "similar" rather than "exactly"... There are a lot of scores that match exactly.

Ed

hyatt · Post by **hyatt** » Thu Jul 01, 2010 6:03 pm

Chris Whittington wrote:
Rebel wrote: Allow me some remarks. I deliberately take the Vas=innocent position for the sake of the discussion.

BB+ wrote: Here is a list that I came up with (again I rely partially on ZW) of "suspicious" things in Rybka:
* Re-use of exact same File/Rank/Line arrays in PST values (as opposed to an "idea" where statics would be built up in this way, but with different numbers)
One can only add the word suspicious if one has checked more programs (say 10-15) and found no such similarities. Perhaps things like these are common in many chess programs. Have Bob, Zach, you, checked 10-15 programs for its absence?

* Time management and UCI parsing, particularly the "0.0" appearance
The 0.0 case is indeed suspicious. UCI parsing: perhaps its code is public domain. Fabien took it and so did Vas. Has this option been researched by Zach, Bob and you?

* Copying of the position at the top of the search (Fruit actually copies it back at every iteration), which is pointless as-is in Rybka (it is copied back after the search, again for no reason). Both search 4 ply when a move is forced. [You can also include setjmp under this "Search Start" heading if you like, and I haven't checked whether Rybka actually copies the position, as opposed to Strelka].
Mine does the same, copy the board before the search starts. It doesn't matter if that is pointless, there are many things in mine that are not in use, they are either remains of previous ideas and forgotten to remove or I leave them there on purpose for future ideas.

Mine searched forced moves 3 plies in the early days until I increased it to 5 in order to have a better move to ponder. I don't see the relevance, 4 is an excellent value also and I suspect most programs do it this way.

* Great similarity in evaluation. Here the "ideas" concept comes into play. Specific things could be the 1:2:4 weighting of minor:rook:queen (why not 3:5:9?) and the identical minutiae with the DrawBishopFlag -- to offset the former, Fruit has a linear interpolation across phases, while Rybka's is more complicated.
Using Vas's own words: I took many things. I don't see the relevance.

When I was going through the Fruit-eval I wrote down 2 ideas, code for trapped (white) bishop (h6/a6) and the code for freeing one's rook in Kg1<>Rh1 situations. It's perfectly legal to add these ideas to mine, release it without any breach of the GPL. Again, what's the relevance?

* The use of 10, 30, 60, 100 weighting in passed pawns. If this were a one-off, I could believe the recurrence of this numerology was accidental.
Mine has similar values. It's not suspicious at all.

Ed
To add to this .... Bob has always made a meal out of the setjmp instruction in Rybka and Fruit, claiming nobody else does it.

Well, I used the technique in 1980-something in Z80 assembler to jump out of the search, all you need to do then is reset the stack pointer and you're fine, back at the tree root. I changed to a more 'acceptable' method of unwinding back up the tree after staff programmers made a huge fuss about how uncompliant the technique was. But, Oxford Softworks later licensed a small chess program source for porting to various small platforms and I was surprised to see the programmer using setjmp (ie jump out of the search and reset vital pointers), told him to get 'proper(!)' and he refused, claiming it was just fine. This was a highly educated and qualified university guy who wrote masses of complex stuff in a bunch of fields and was highly reliable for producing bug free code, fast.

Absolutely impossible, because setjmp() and longjmp() are _C_ things. I did things remarkably differently in Cray Blitz. We used an iterated search so that when time ran out, we just exited the search function instantly. That doesn't work for recursive implementations, which I'd be willing to bet you didn't do in assembly language due to the inherent messiness. This is about how you get out of a deep recursive stack of calls as quickly as possible. The setjmp()/longjmp() approach is a horrible solution. Global variables are in an unknown state and you have to clean them up and get them back to "sane". What about locks? The state of a file if you are writing a log? The board state, assuming, logically, that no one does copy/make due to performance issues? Just because one person says "it is OK" certainly doesn't mean it is. I've seen lots of university CS faculty that write horrible code, and some that can't write code at all.
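
The "unwinding back up the tree" alternative to setjmp()/longjmp() in a recursive search is usually some form of abort flag. A minimal sketch follows, with hypothetical names (`walk`, `stop_flag`, `board_depth`) and a node budget standing in for the clock; it is not taken from any program discussed here. Every level checks the flag and returns normally, so each unmake still runs and the global state is back at the root when the search exits.

```c
static int board_depth;          /* toy global position: number of moves made */
static int stop_flag;            /* set once the budget is exhausted */
static long seen, seen_limit;    /* node budget standing in for a clock */

static void walk(int depth) {
    if (++seen > seen_limit) {
        stop_flag = 1;           /* ask every caller above us to stop */
        return;
    }
    if (depth == 0) return;
    for (int m = 0; m < 2 && !stop_flag; m++) {
        board_depth++;           /* make */
        walk(depth - 1);
        board_depth--;           /* unmake runs even while aborting */
    }
}

/* Returns 1 if the search was cut short, 0 if it completed. */
int walk_root(int depth, long limit) {
    seen = 0;
    seen_limit = limit;
    stop_flag = 0;
    walk(depth);
    return stop_flag;
}

int board_depth_now(void) { return board_depth; }
```

The cost is one extra test per loop iteration and one return per ply on the way out, in exchange for never leaving the position half-made.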

setjmp is way more common than Bob might want to think, it may offend purists but no way can it be suggested it is so rare that dual use implies plagiarism.

It is but one "nail in the coffin". By itself, I would agree. But taken as part of the "whole"? Not a chance.

Curiously, in answer to someone recently, I read that Bob postulated setjmp as the most incriminating piece of code in Rybka!!

No you didn't. You read that that was the _first_ thing that caught my eye and got me interested in looking deeper. It was such an unusual (and lousy) way of unwinding a search that it caught my eye when I saw it in Fruit and Strelka. And then it was found in Rybka 1. It was "a" red-flag. Not "the" red flag. Sort of like hearing about a robbery where someone stole a TV, then you see someone hustling down a nearby street with a large box on their shoulder, and you think "TV", "Box", that's suspicious and deserves a further look.

hyatt · Post by **hyatt** » Thu Jul 01, 2010 6:08 pm

Rebel wrote:
Chris Whittington wrote: Well, I used the technique in 1980-something in Z80 assembler to jump out of the search, all you need to do then is reset the stack pointer and you're fine, back at the tree root. I changed to a more 'acceptable' method of unwinding back up the tree after staff programmers made a huge fuss about how uncompliant the technique was. But, Oxford Softworks later licensed a small chess program source for porting to various small platforms and I was surprised to see the programmer using setjmp (ie jump out of the search and reset vital pointers), told him to get 'proper(!)' and he refused, claiming it was just fine. This was a highly educated and qualified university guy who wrote masses of complex stuff in a bunch of fields and was highly reliable for producing bug free code, fast.

setjmp is way more common than Bob might want to think, it may offend purists but no way can it be suggested it is so rare that dual use implies plagiarism.

Curiously, in answer to someone recently, I read that Bob postulated setjmp as the most incriminating piece of code in Rybka!!
I can imagine the use of "setjmp" in a chess program. It's excellent for recursive stuff. Like you I used it in my early 6502 assembler days to save ROM space. Just push/pop the stack, there you go, wonderful.... If Bob really said the above bold he better retract. I know another programmer who also used it, maybe still does. That makes 3 already.

Ed

First, what setjmp()/longjmp() does is to simply rip you out of the search, back to the point where you did the setjmp(), with the stack restored to that point. No global data is restored. No locks are cleared. So you have to do your own recovery after the fact. This does not apply to either assembly language or the old-style iterated search (as opposed to recursive search). In Cray Blitz, we could "return" from search at any instant we chose, because there was no call stack to deal with; we looped on the variable "ply" to advance/retreat through the tree.
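
A rough sketch of what looping on "ply" can look like, written in C for comparison and making no claim about how Cray Blitz actually did it. The function name, the fixed two moves per node and the node budget are all assumptions for illustration. The tree walk keeps its own per-ply array instead of a call stack, so stopping is an ordinary return from a single function.

```c
#define MAX_DEPTH 8

/* Visits a binary tree of the given depth without recursion.
   Returns the number of nodes visited; stops once the budget is spent. */
long iterated_search(int depth, long node_budget) {
    int next_move[MAX_DEPTH + 1];    /* next move to try at each ply */
    int ply = 0;
    long visited = 1;                /* the root counts as visited */

    if (depth > MAX_DEPTH) depth = MAX_DEPTH;
    next_move[0] = 0;
    for (;;) {
        if (visited >= node_budget) {
            return visited;          /* plain return: no call stack to unwind */
        }
        if (ply < depth && next_move[ply] < 2) {
            next_move[ply]++;        /* "make" the move and advance one ply */
            ply++;
            next_move[ply] = 0;
            visited++;
        } else {
            if (ply == 0) {
                return visited;      /* tree exhausted */
            }
            ply--;                   /* "unmake" and retreat one ply */
        }
    }
}
```

Whether any position data still needs restoring after the early return depends on how the position is stored; the sketch only shows that no stack of calls is left behind.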

What I am talking about is using setjmp()/longjmp() _explicitly_ in a "C" or "C++" program. Assembly language doesn't belong in the discussion. Did you write recursive assembly on the 6502? If so you could have made it 3-4 times faster by using a loop as we did in Cray Blitz. Even today, calls are bad on X86 compared to an iterated search, which has caused a few to use a loop rather than a recursive implementation. But assembly doesn't belong in this discussion. Look at the publicly-available C/C++ programs and grep for setjmp or longjmp to get a better feel, rather than going back to assembly language coding where things are (and were) far different in terms of methodology.

zwegner · Post by **zwegner** » Thu Jul 01, 2010 6:10 pm

Chris Whittington wrote: OK, it's true that Zach's report, as you say, under the microscope, has, shall we say, flaws, or preconceived guilt notions as I would call them. But I do get the message, behind the report there is a substantial body of similarities. However, it's a similarity of IDEAS rather than implementations of the ideas, and, if we ascribe to Vas good motives (difficult for you, I appreciate) to make a very strong program, all his own work, ultimately, via test bed or parallel approach - then we have evidence for that too.

I would disagree with "preconceived guilt notions". That's not what it is at all. I originally got involved in this after seeing the UCI parser, which was just too similar to be coincidence IMO. I then spent an awful lot of time looking over Fruit, Strelka, and eventually disassembling the vast majority of Rybka 1.0. I came to the conclusion, after seeing what I saw, that Rybka started its life as Fruit. Then I spent a lot more time producing the report. The report was intended to argue my opinion--I didn't start by trying to prove Vas guilty of something by any means necessary. Maybe some of the language is a bit too biased, and rewording some stuff would be productive. The main reason for that would be the climate of the community around the time it was written. The vast majority of people would never believe what I was saying, and I took a lot of heat for it. There's still a lot of people that refuse to believe the evidence (if they even bother to look at it). If you look at the evidence and come to a different conclusion though, I can respect that.

I would definitely not say that the similarities are just "ideas". What I would call implementation copying is everywhere, search, evaluation, UCI parser. The line between ideas and implementation is really quite blurry, as the R/F case shows and the R/I case just reinforced that. If you look at one small part of it, you can maybe say that it's just taking an idea, but IMO, looking at the big picture, there is just too much there to be dismissed as ideas. The 0.0 case proves that Vas copied and pasted some code--the question is just how much. And even if the amount of copying and pasting was just limited to this section, I think the other similarities (particularly the PST and passed pawn constants) are enough to say that what was done was ethically wrong.
