The Evidence against Rybka

Code, algorithms, languages, construction...
BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: The Evidence against Rybka

Post by BB+ » Sat Jul 09, 2011 2:52 pm

Trotsky@rybkaforum.net wrote:I apologise for making this report at this late stage, but I was excluded from the panel deliberations and therefore only have access to the material now. [...]
Watkins' data was used by him to make 1 to 1 comparisons. However, it is also possible to make 1 to group comparisons for all the programs in the group. Just as a high score in the 1 to 1 comparisons is suggested by Watkins to indicate plagiarism between the two programs concerned, so a high score in the 1 to group comparison would indicate plagiarism within the group as a whole.
I don't think this one-to-group comparison really measures such a notion. It seems to me that the described technique is essentially an averaging process among the engine group, so it can't be a surprise that the results end up tending toward a whitewash. I also don't think that this "comparing to a group" (even when done in a more careful manner than simply taking maximal intra-group overlap) is much related philosophically to plagiarism.
Percentage plagiarism (corresponds to the 1 to 1 Watkins percentage headline figure of 74%, although I might get corrected on this)
I'd say the new percentages correspond to something a bit different. Speaking (loosely) in terms of graph theory, this new measurement determines the minimal distance from a given node to any other node, while the original method determined the distance from a given node to a specific other one. I don't think there is any obvious interpretation of the numbers obtained, and I would certainly expect the resulting percentages to be larger when comparing to a group rather than a singleton. [On a pedantic note, the raw percentages don't matter anyway, only the probabilities that are derived from them]. If one were (say) to apply the Abstraction-Filtration-Comparison Test to a 1-to-group situation, I'd guess a significant amount of stuff would be evicted at the Filtration stage.

Another comment is that the resulting numbers need to be adjusted and/or re-interpreted for the size/span of the group. Simply making the group really big would tend to make everything end up as 100% plagiarism [then again, maybe that's the point?!]. Putting this as a mathematical example, if I double the size of your data set by adding 8 new engines: the first of which copies Crafty for the first 6 features, then RESP for the next 6 features, etc.; the second of which copies RESP for the first 6 features, etc., then every engine would end up 100% plagiarised. The fact that this 1-to-group comparison collapses when such "averaged" engines are added indicates to me that it is not too useful a statistic for distinguishing engines and/or engine pairs. Contrariwise, the EVAL_COMP methodology is somewhat robust against the addition of such averaged engines (at least until you throw in so many "(re-)averaged" engines that they dominate the analysis).
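To make the collapse concrete, here is a minimal sketch (in Python; the engines, features, and "choices" are invented purely for illustration, and this is not the Panel's tooling) of a 1-to-group score computed as the fraction of an engine's feature choices matched somewhere else in the group, before and after adding "averaged" engines stitched together from blocks of the originals:

Code: Select all

def one_to_group(engines):
    # 1-to-group score: the fraction of an engine's features whose
    # implementation choice is matched by at least one OTHER engine.
    scores = {}
    for name, feats in engines.items():
        others = [f for n, f in engines.items() if n != name]
        hits = sum(any(other.get(k) == v for other in others)
                   for k, v in feats.items())
        scores[name] = round(hits / len(feats), 2)
    return scores

# Three toy engines, six features each, with almost all choices distinct.
engines = {e: {f"feat{i}": f"{e}{i}" for i in range(6)} for e in "ABC"}
engines["B"]["feat0"] = "A0"              # one genuinely shared choice
print(one_to_group(engines))              # {'A': 0.17, 'B': 0.17, 'C': 0.0}

# Add "averaged" engines that copy two-feature blocks from the originals,
# rotating the donor order, as in the thought experiment above.
for j in range(3):
    donors = ["ABC"[(j + b) % 3] for b in range(3)]
    engines[f"avg{j}"] = {f"feat{i}": engines[donors[i // 2]][f"feat{i}"]
                          for i in range(6)}
print(one_to_group(engines))              # every engine now scores 1.0
The scores of the original engines jump to 100% even though nothing about them changed, which is the objection above in miniature.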

At a copyright level, individual features would likely not be "protected content", but a specific selection of them could be considered such. For such purposes I think it is clear that a 1-to-group analysis (using the "maximal overlap" metric) is not as useful as a 1-to-1 analysis, particularly when the group is large. Analogously, individual elements of a book plot are not (typically) subject to copyright, but the specific combining of said elements can often be so, to the extent that the said combination was "creative" (a subjective term of course) -- I would guess one could re-phrase the AFC test to analyse book plots if desired, first identifying layers of abstraction for the plot, etc.
Crafty 0.58
RESP 0.65
Ryb1 0.76
Phal 0.54
Fail 0.67
Fr21 0.81
Pepi 0.60
EX5b 0.64
Even with these issues about said method, I might point out that the "average plagiarising/plagiarised" level obtained here is 65%, with (other than Fruit and Rybka) only RESP and Faile reaching that level [another point: you probably need to adjust the 1-to-group comparison for engines like Faile that have few features]. Removing Fruit and Rybka [applying an outlier test], the other 6 engines total up to a 61% mean with 4.5% standard deviation. So it would only be natural to investigate Rybka and Fruit more specifically, given that this 1-to-group analysis puts them at 3-4 sigma in comparison to the other 6 engines (admittedly a small sample). Something that measures peer-to-peer overlap rather than peer-to-group would then be a logical addition to the methodology.
[...] there appears to be massive inter-group plagiarism in the evaluation function of all the selected programs. No one program stands out above any other program.
It's still not clear to me what exactly "inter-group plagiarism" really means [as any significant intra-group commonalities should be filtered out], while the above statistic notes that Rybka and Fruit do in fact stand out above the other 6 programs.

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: The Evidence against Rybka

Post by BB+ » Sun Jul 10, 2011 11:25 pm

Two corrections:
BB+ wrote:the other 6 engines total up to 61% mean with 4.5% standard deviation.
The latter is wrong by a factor of sqrt(6/5) (I forgot the rules of low-population statistics). It is still correct to say that the 1-to-group numbers (whatever they mean) for Rybka and Fruit are 3-4 standard deviations above the [small] sample formed by the other 6 engines.
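For the record, the arithmetic can be checked in a few lines (a minimal sketch using the eight 1-to-group percentages quoted above; pstdev/stdev are the population and n-1 sample deviations, the latter being the sqrt(6/5)-corrected figure):

Code: Select all

from statistics import mean, pstdev, stdev

group = {"Crafty": 0.58, "RESP": 0.65, "Ryb1": 0.76, "Phal": 0.54,
         "Fail": 0.67, "Fr21": 0.81, "Pepi": 0.60, "EX5b": 0.64}

others = [v for k, v in group.items() if k not in ("Ryb1", "Fr21")]
mu = mean(others)              # ~0.613, the 61% mean
sd_pop = pstdev(others)        # ~0.045, the 4.5% figure quoted originally
sd_smp = stdev(others)         # ~0.049, the population value times sqrt(6/5)

for k in ("Ryb1", "Fr21"):
    print(k, round((group[k] - mu) / sd_smp, 1), "sigma above the other six")
# Ryb1 3.0 sigma, Fr21 4.0 sigma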
Percentage plagiarism (corresponds to the 1 to 1 Watkins percentage headline figure of 74%, although I might get corrected on this)
BB+ wrote:[another point: you probably need to adjust the 1-to-group comparison for engines like Faile that have few features]
Sorry I didn't realise this the first time around -- the new 1-to-group number seems to be "unidirectional", in that it doesn't measure the group-to-1 overlap. This concern was handled (somewhat crudely) in EVAL_COMP by averaging the feature counts of each pair A and B before computing the overlap percentage between them. I have no idea if this is reasonable here (maybe you want to take a weighted average, depending on the group size), but it warps the raw data as follows (counting 47 features for the "group"):

Code: Select all

Crafty 20.5 35 -> 20.5/(41)    50%
RESP   15.7 24 -> 15.7/(35.5)  44%
Ryb1   24.5 32 -> 24.5/(39.5)  62%
Phal   21.8 40 -> 21.8/(43.5)  50%
Fail    9.5 14 ->  9.5/(30.5)  31%
Fr21   23.7 29 -> 23.7/(38)    62%
Pepi   23.1 38 -> 23.1/(42.5)  54%
EX5b   15.5 24 -> 15.5/(35.5)  44%
I'm not sure I put much value on the final numbers (which seem to be more about how "light-weight" an eval is, at least outside Fruit/Rybka), but the principal objects of the ICGA investigation still have noticeably large values.
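For completeness, the adjusted column above is reproducible mechanically -- divide each raw overlap by the average of the engine's own feature count and the 47 features of the "group" (a minimal sketch, with the numbers simply copied from the table):

Code: Select all

raw = {"Crafty": (20.5, 35), "RESP": (15.7, 24), "Ryb1": (24.5, 32),
       "Phal":   (21.8, 40), "Fail": ( 9.5, 14), "Fr21": (23.7, 29),
       "Pepi":   (23.1, 38), "EX5b": (15.5, 24)}
GROUP_FEATURES = 47

for name, (overlap, own_count) in raw.items():
    denom = (own_count + GROUP_FEATURES) / 2   # average the two feature counts
    print(f"{name:6s} {overlap:4} / {denom:4} = {overlap / denom:.0%}")
# Reproduces the 50/44/62/50/31/62/54/44 percent column above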

My opinion remains that the whole "plagiarism against a group" concept is misguided. The issue that has been brought up, namely that there is a large amount of "general evaluation knowledge", was discussed (in surrogate, if not directly) in the Panel, and the consensus appeared to be that EVAL_COMP was sufficient to overcome any such objections. For instance, with "originality", specific commonalities are more notable than general ones (note that the earlier EVAL_COMP versions, in Appendix D.2.3 following Section 3 of RYBKA_FRUIT, were cruder in this respect); pairwise comparisons between engines are a better indicator of "plagiarism" [this being copying from mainly one other source] than comparisons against the group as a whole; and the Abstraction-Filtration-Comparison Test should filter out (for instance) "standard programming techniques" and/or things in the public domain [see the second/third paragraphs below]:
Wikipedia wrote: Filtration

The second step is to remove from consideration aspects of the program which are not legally protectable by copyright. The analysis is done at each level of abstraction identified in the previous step. The court identifies three factors to consider during this step: elements dictated by efficiency, elements dictated by external factors, and elements taken from the public domain.

The court explains that elements dictated by efficiency are removed from consideration based on the merger doctrine which states that a form of expression that is incidental to the idea can not be protected by copyright. In computer programs, concerns for efficiency may limit the possible ways to achieve a particular function, making a particular expression necessary to achieving the idea. In this case, the expression is not protected by copyright.

Eliminating elements dictated by external factors is an application of the scènes à faire doctrine to computer programs. The doctrine holds that elements necessary for, or standard to, expression in some particular theme can not be protected by copyright. Elements dictated by external factors may include hardware specifications, interoperability and compatibility requirements, design standards, demands of the market being served, and standard programming techniques.

Finally, material that exists in the public domain can not be copyrighted and is also removed from the analysis.
As such, something vague like "king safety" must either be broken down into component parts, or ignored completely. The former choice is preferred in general (when possible), and was carried out [to some extent] in EVAL_COMP. The 1-to-group paradigm seems to me to be closer to the "ignore it completely" option. Again using book plots as an example: book A has its lead character as an Italian born in Florence but living in (somewhat populous) Lugano, whose tagline is to complain about the price of espresso while sipping a latte; book B has its lead character as a Bolognese living in remote Cevio, who often complains about the price of espresso while gulping a mocha (or Moka). One can either try to break down the component parts (and here one could seemingly argue either way for just this character, so maybe the elements should be "quantified", and then the whole context in the book should be considered, etc.), or one can just say "almost every novel has a main character, and any specific elements are not new when compared to the group of novels (of this genre)", and thus ignore any "overlap" here.

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: The Evidence against Rybka

Post by BB+ » Thu Jul 14, 2011 10:09 am

all that can be owned of commonly used material is the actual implementation of that material and since, on your own admission, the material can only be compared on a semantic basis (ie the implementation is different) -> case disappears in puff of blue smoke.
It's not clear to me whether this discussion is about copyright law or the ICGA process. Since the latter seems to be complete [as Rajlich appears content (though not happy) with the verdict], I will address the former, though the arguments are quite similar.

The phrase "commonly used material" needs to be considered more carefully (evaluation features are such in the large, but likely not in the specific), while the phrase "the actual implementation of that material" again needs more adjectives. For instance, Nimmer on Copyright explains how West Side Story can be considered to infringe the [nonexistent/expired] copyright of Romeo and Juliet via plot-copying, and here the "actual implementation[s]" are clearly different, going much beyond merely (say) a language translation [something more like mailbox->bitboard]. Poland is rather noted for its historically strong protection of copyright (unlikely the UK, from what I've been able to determine, but I could be wrong), so I expect something closer to the "large" concept of copyright to be applicable.

Chris Whittington
Posts: 437
Joined: Wed Jun 09, 2010 6:25 pm

Re: The Evidence against Rybka

Post by Chris Whittington » Thu Jul 14, 2011 3:39 pm

BB+ wrote:
all that can be owned of commonly used material is the actual implementation of that material and since, on your own admission, the material can only be compared on a semantic basis (ie the implementation is different) -> case disappears in puff of blue smoke.
It's not clear to me whether this discussion is about copyright law or the ICGA process. Since the latter seems to be complete [as Rajlich appears content (though not happy) with the verdict], I will address the former, though the arguments are quite similar.

The phrase "commonly used material" needs to be considered more carefully (evaluation features are such in the large, but likely not in the specific), while the phrase "the actual implementation of that material" again needs more adjectives. For instance, Nimmer on Copyright explains how West Side Story can be considered to infringe the [nonexistent/expired] copyright of Romeo and Juliet via plot-copying, and here the "actual implementation[s]" are clearly different, going much beyond merely (say) a language translation [something more like mailbox->bitboard]. Poland is rather noted for its historically strong protection of copyright (unlikely the UK, from what I've been able to determine, but I could be wrong), so I expect something closer to the "large" concept of copyright to be applicable.
I'm looking for the failure point in your document on eval feature comparison. It looks like I've not described it well enough, since this cross-posted response doesn't address the point I identified.

Failure is in filtration stage.
We understand you did not compare at the code-to-code, sub-function abstraction level because that would not work for your purposes. If you had done so, due process says you would have used PD filtering at a minimum.
So you went up an abstraction level and compared the meanings expressed by the code (which is also what took you from copyright violation to plagiarism). As a perhaps unintended consequence, your method/results showed substantial common usage at the sub-function level. At this abstraction level PD filtering does not have too much meaning; we don't usually talk about PD and meaning in the same sentence. You needed to have found a parallel filtration method in the absence of PD filtering. I think that means you should have gone to common usage filtering. This has an intuitively correct feel about it because it factors in the concepts of uniqueness and originality. It has to be worse to plagiarise (used in your sense) original, new, unique stuff than to correspondingly plagiarise old, unoriginal and commonly used stuff. And, if stuff is old, unoriginal and commonly used, then the term plagiarise becomes difficult to use. Plagiarise minimax? Plagiarise LMR? Plagiarise null move?

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: The Evidence against Rybka

Post by BB+ » Thu Jul 14, 2011 4:33 pm

Chris Whittington wrote:I'm looking for the failure point in your document on eval feature comparison. Looks like I've not described it well enough since this cross posting response doesn't address the point identified.

Failure is in filtration stage.
We understand you did not compare at the code-to-code, sub-function abstraction level because that would not work for your purposes. If you had done so, due process says you would have used PD filtering at a minimum.

So you went up an abstraction level and compared the meanings expressed by the code (which is also what took you from copyright violation to plagiarism). As a perhaps unintended consequence, your method/results showed substantial common usage at the sub-function level. At this abstraction level PD filtering does not have too much meaning; we don't usually talk about PD and meaning in the same sentence. You needed to have found a parallel filtration method in the absence of PD filtering. I think that means you should have gone to common usage filtering. This has an intuitively correct feel about it because it factors in the concepts of uniqueness and originality. It has to be worse to plagiarise (used in your sense) original, new, unique stuff than to correspondingly plagiarise old, unoriginal and commonly used stuff. And, if stuff is old, unoriginal and commonly used, then the term plagiarise becomes difficult to use.
I'm not quite sure what you mean by "common usage" filtering. For instance, most programs consider isolated pawns. Some do quite similar things, while others differ a lot. How much should be filtered? I would argue that the rule of thumb would be that most things that are "adaptable" should not be filtered.
EVAL_COMP wrote:Fruit 2.1, Rybka 1.0 Beta, and Rybka 2.3.2a all give a penalty for an isolated pawn that depends on whether the file is half-open or closed. Crafty 19.0 counts the number of isolated pawns, and the subcount of those on open files, and then applies array-based scores to these. Phalanx XXII gives a file-based penalty, and then adjusts the score based upon the number of knights the opponent has, the number of bishops we have, and whether an opposing rook attacks it. There is then a correction if an isolated pawn is a ``ram'', that is, blocked by an enemy pawn face-to-face, and also a doubled-and-isolated penalty. Pepito 1.59 has a file-based array for penalties, though the contents are constant except for the rook files. There is also a further penalty for multiple isolani. Faile 1.4 penalises isolated pawns by a constant amount, with half-open files penalised more (same as Fruit/Rybka). RESP 0.19 also penalises isolated pawns by a constant amount, and gives an additional penalty to isolated pawns that are doubled. EXchess also gives a constant penalty for isolated pawns, and further stores it in a king/queenside defect count.
Just to get us moving toward specifics: what would you consider the "common usage" filtering for isolated pawns to be?
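To illustrate how much room for "adaptation" there is even within this one sub-feature, here is a toy sketch (invented penalty values and a simplified pawn model -- not any engine's actual code) of two of the schemes paraphrased in the EVAL_COMP excerpt above, a Fruit-style constant-with-half-open-file penalty and a Crafty-style count-then-array approach:

Code: Select all

# Toy model: each side's pawns are (file, rank) pairs, files 0-7.
# All penalty values are invented; this is not any engine's real code.

def isolated_files(my_pawns):
    files = {f for f, _ in my_pawns}
    return [f for f, _ in my_pawns
            if (f - 1) not in files and (f + 1) not in files]

def fruit_like(my_pawns, their_pawns):
    """Constant penalty per isolated pawn, larger if the file is half-open
    (no enemy pawn on it)."""
    their_files = {f for f, _ in their_pawns}
    CLOSED, HALF_OPEN = 8, 12
    return -sum(HALF_OPEN if f not in their_files else CLOSED
                for f in isolated_files(my_pawns))

def crafty_like(my_pawns, their_pawns):
    """Count isolated pawns, and those on open files, then index arrays."""
    their_files = {f for f, _ in their_pawns}
    isolani = isolated_files(my_pawns)
    n_open = sum(1 for f in isolani if f not in their_files)
    ISO_ALL  = [0, 6, 14, 24, 36, 50, 66, 84, 104]
    ISO_OPEN = [0, 4, 10, 18, 28, 40, 54, 70, 88]
    return -(ISO_ALL[len(isolani)] + ISO_OPEN[n_open])

white = [(0, 2), (3, 3), (5, 4)]   # a-, d-, f-file pawns, all isolated
black = [(3, 6), (6, 6)]
print(fruit_like(white, black), crafty_like(white, black))   # -32 -34
Both "consider isolated pawns", yet the specific choices (what is counted, how the file state enters, constant versus array) are exactly the adaptable parts.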

As for the more general "Filtration" step question, I copied the Wikipedia blurb about this 3 posts previous. The court identifies three factors to consider during this step: elements dictated by efficiency, elements dictated by external factors, and elements taken from the public domain. [...] Eliminating elements dictated by external factors is an application of the scènes à faire doctrine to computer programs. The doctrine holds that elements necessary for, or standard to, expression in some particular theme can not be protected by copyright. Elements dictated by external factors may include hardware specifications, interoperability and compatibility requirements, design standards, demands of the market being served, and standard programming techniques.
Chris Whittington wrote:Plagiarise minimax? Plagiarise LMR? Plagiarise null move?
Each of these are sufficiently broad to have multiple ways of realising them. For the latter two, here would be a more specific level in the abstraction scale (and there would be a few more with even more details):
*) Plagiarise: LMR/LMP [pruning] with a pre-makemove condition involving "expected positional gain" plus score margin and depth (and movecount of course), then another such condition [with different score bounds] if a move was bad-SEE, and thirdly a post-makemove condition if the "expected positional gain" didn't show up -- with the "in-check" versions of such LMR/LMP again being plagiarised in a parallel fashion.
*) Plagiarise: recursive null-move with R=2 for early game, R=3 for late game, turn off null move when only one piece (not a queen) is left, verify null-move (but not recursively) with a 5 ply reduction when depth is large enough -- and don't do null-move if king danger is high, or a MATE_THREAT is detected, or there was a singular extension in the last N ply.

Either of these could be "plagiarised" [besides just blatantly copying] to some extent, if there were sufficiently many points of commonality. [And I don't think either of these would be of sufficient import to trigger an overall "nonoriginality" finding by itself, unless the details were indeed the same and there was no alternative explanation given (like testing results)].
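As an illustration of how many independent decision points even a "standard" technique contains, here is a toy sketch of the null-move recipe in the second bullet (the position summary, parameter names, and thresholds are all invented -- this is not any particular engine's logic):

Code: Select all

from dataclasses import dataclass

@dataclass
class PosSummary:
    """Toy summary of the side to move's situation; fields are invented."""
    non_pawn_pieces: int        # pieces other than kings and pawns
    queens: int
    is_endgame: bool
    king_danger: int            # some king-safety measure
    mate_threat: bool           # a mate threat was flagged earlier
    plies_since_singular: int   # plies since the last singular extension

def null_move_plan(pos, depth, KING_DANGER_LIMIT=20, SINGULAR_WINDOW=4,
                   VERIFY_MIN_DEPTH=8, VERIFY_REDUCTION=5):
    """Return (try_null_move, reduction R, verification depth or None)."""
    if pos.non_pawn_pieces <= 1 and pos.queens == 0:
        return False, 0, None                  # only one (non-queen) piece left
    if pos.king_danger >= KING_DANGER_LIMIT or pos.mate_threat:
        return False, 0, None
    if pos.plies_since_singular < SINGULAR_WINDOW:
        return False, 0, None
    R = 3 if pos.is_endgame else 2             # R=2 early game, R=3 late game
    verify = depth - VERIFY_REDUCTION if depth >= VERIFY_MIN_DEPTH else None
    return True, R, verify

pos = PosSummary(non_pawn_pieces=4, queens=1, is_endgame=False,
                 king_danger=5, mate_threat=False, plies_since_singular=10)
print(null_move_plan(pos, depth=10))           # (True, 2, 5)
Each gating condition and constant is an independent choice, which is why a long run of identical choices is more telling than the mere presence of "null move".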

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: The Evidence against Rybka

Post by BB+ » Thu Jul 14, 2011 4:53 pm

To be clear: I agree with you (CW) that the filtration step should be modified as the abstraction levels become less detailed. For instance, "standard programming techniques" (as per the Wikipedia blurb) include minimax, null-move, and LMR at the highest level. Similarly, it is a "standard" technique in computer chess to consider isolated pawns. However, I don't agree that "common usage" filtering at the level given above (for isolated pawns) would be so broad as to eliminate any distinctions between the specific choices of sub-features for isolated pawns.

Chris Whittington
Posts: 437
Joined: Wed Jun 09, 2010 6:25 pm

Re: The Evidence against Rybka

Post by Chris Whittington » Fri Jul 15, 2011 6:59 pm

BB+ wrote:To be clear: I agree with you (CW) that the filtration step should be modified as the abstraction levels become less detailed. For instance, "standard programming techniques" (as per the Wikipedia blurb) include minimax, null-move, and LMR at the highest level. Similarly, it is a "standard" technique in computer chess to consider isolated pawns. However, I don't agree that "common usage" filtering at the level given above (for isolated pawns) would be so broad as to eliminate any distinctions between the specific choices of sub-features for isolated pawns.
Sorry not to respond sooner, but I've been away for a few days and not been thinking about this topic. It's difficult because of the generally loose use of language around the issue, and that creates confusion, well, in my head anyway. We're also, I think, in uncharted terrain - plagiarism comparisons are presumably much less beneficial/costly to the finances of the parties and the lawyers; we are mostly out of the realm of law at this stage, so we may well have to make our own route map rather than rely on the wiki to find an appropriate and fair filtration technique.

At the filtration stage we need to look at the quality and quantity of the things we would be about to compare. And, unlike when filtration is done on code segments, where the PD or not-PD choice is quite digital, when we filter on meaning we can perhaps afford to be grey-scale: something can be more important or it can be less important. Maybe looking at extreme cases might help set some bounds on the problem.

Suppose program XYZ comes up with a new technique, very simple idea, perhaps like LMR or null move, which gives a huge ELO advantage and this program becomes leader. Programmer ABC reverse engineers or sees open source of XYZ, discovers this new technique, and translates it into his own previously functional program, fulfilling your 'plagiarism' test of equivalent meaning. Program ABC catches up with XYZ, neutralising the advantage of the initial 'discovery'.

At the other end of the grey scale, Programmer ABC looks at program PQR, notes the penalty added by PQR for an isolated pawn is 0.237, and alters his ABC penalty from 0.298 to 0.237. This makes not a blind bit of difference of course, because the key to evaluation is tuning, especially tuning in association with a tuned book. There's no ELO difference, but your method would still include this case with a magnitude equivalent to the serious case above. There's no inclusion of quality factors.

There's also a quantity factor. If there's plagiarism (your definition again) of something unique, only in one place, perhaps new - that would appear to be at the strong end of the grey scale. Whereas plagiarism of something in widespread usage, not unique and perhaps around for a long time, would appear at the weak end; perhaps in such a case it is not even really possible to say exactly where the plagiarism took place from - program A, B, C or D, etc.

Again, your method does not factor in quantity either.

So, your current method takes no account at all of the importance or otherwise of the plagiarised thing; it just bean-counts the instances: you decide on a degree of 'matching' and ascribe a score, these scores get added up and divided by the count of instances, giving this headline figure of 74%.

What is needed at the filtration stage is a fair test of quality/quantity (and there may be other factors I didn't think of) which has the effect of either junking the instance, or of somehow addressing its 'importance' and factoring that in.
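For instance, a purely illustrative sketch (invented feature names and weights) of what such a quality/quantity weighting might do to the bean count:

Code: Select all

# Purely illustrative: an unweighted "bean count" score versus one where each
# matched instance carries invented importance and rarity weights in [0, 1].
matches = [
    # (feature,               match 0..1, importance, rarity)
    ("material values",        1.0,        0.2,        0.05),
    ("isolated pawn penalty",  1.0,        0.3,        0.10),
    ("mobility area rule",     0.9,        0.5,        0.30),
    ("lazy eval margins",      0.2,        0.9,        0.80),
]

unweighted = sum(m for _, m, _, _ in matches) / len(matches)
w = [imp * rar for _, _, imp, rar in matches]
weighted = sum(m * wi for (_, m, _, _), wi in zip(matches, w)) / sum(w)

print(f"bean count: {unweighted:.0%}   importance/rarity weighted: {weighted:.0%}")
# ~78% unweighted versus ~35% weighted for the same set of matches
Where the weights come from is exactly the open question, but the headline figure clearly depends on them.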

Anyway, I hope my weedy attempt at defining bounds might be useful - principally it is a thorny multi-disciplinary problem and we know programmers don't like those ;-)

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: The Evidence against Rybka

Post by BB+ » Sat Jul 16, 2011 9:54 pm

Sorry not to respond sooner but I've been away for a few days and not been thinking about this topic.
Gah, you stole my excuse... I was in Lille on Friday, then planned to be away from a computer another 2 days, but couldn't get the bus ticket I wanted, and so ended up at the Maths Institute here in Oberwolfach a day early. I won't pretend to be able to assimilate everything you say after having just travelled for the whole day, but for now I will just briefly note that GCP was of the opinion that one of the reasons why Fabien/FSF should bring/publicise a case against Rajlich is precisely to make programmers more aware that copyright [on computer programs] extends to any creative expression therein, and not just raw code. As you say, it is uncharted to be sure.

The other comment I'll make now (and probably expand on later) is that tuning numbers could be considered more of an "automatic" process [hence less creative] than choosing evaluation features, though the amount of "automation" here might depend on the tuning method -- this was one reason I didn't try to rank the importance of evaluation features. Also, as I think I noted previously, I wanted to separate the "features" from the "values", as one could copy one but not the other, and putting them into one basket of "evaluation" would presumably make for a less sharp tool as a "plagiarism detector" -- my guess is that copyright considerations will need to guesstimate how relatively important a specific "feature set" is vis-a-vis the internal weightings therein, whereas the ICGA "originality" standard was more of a binary yes/no than a percentage.

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: The Evidence against Rybka

Post by BB+ » Mon Jul 18, 2011 1:24 pm

It appears that the phrase "copying" is now being subjected to semantic gymnastics. So my claims are:

*) The Rybka 1.0 Beta executable contains no literally copied evaluation code from Fruit 2.1.
*) Rybka 1.0 Beta contains sufficiently much creative expression from the Fruit 2.1 evaluation code so as to transgress the ICGA Rules.
*) The question of whether and to what extent Rybka 1.0 Beta and later versions infringe the copyright of Fruit 2.1 will be the subject of future civil action. My own opinion is that this is closer in spirit to the second point.

Personally, if I heard the word "copy", I would not parse it as being limited to literal copying.

Regarding the first point, there is more than one instance of literally copied code from Fruit 2.1 in Rybka 1.0 Beta, any one of which should suffice for probative similarity in a copyright action.

I've been trying to find a list of court decisions that cite the Abstraction-Filtration-Comparison Test, but have been unable to do so as of yet. The best I found (with highs/lows of its applicability) was http://www.ladas.com/Patents/Computer/S ... twa06.html

Another useful reference place starts at http://digital-law-online.info/lpdi1.0/treatise21.html
[...] it is of course essential to any protection of literary property ... that the right cannot be limited literally to the text, else a plagiarist would escape by immaterial variations.

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: The Evidence against Rybka

Post by BB+ » Tue Jul 19, 2011 2:06 pm

Disassembling a commercial piece of software and publishing/using the source code in whatever manner and without asking the author requires the permission of a regular COURT.
That's why the ICGA investigators were NOT allowed to disassemble Rybka and discuss its internals in public (like they did).
The ICGA investigators are NOT a regular court, nor did they have Mr. Rajlich's permission to publish the source code of Rybka.
Therefore whatever evidence was found against Rybka 1.0 beta due to reverse engineering was achieved by ILLEGAL means.
Well, Mr. Rajlich is a strong chess player, and the ICGA investigators have checkmated themselves by queering Rybka's license.
I'm not sure in what jurisdiction the poster lives, but disassembling is legal in many places, particularly for "fair use" reasons such as study. The "ICGA investigators" didn't publish any "source code of Rybka" per se, but rather direct disassembly dumps with comments. Finally, none of the Rybka versions that were studied came with a specific license, so I'm at a loss as to what "queering" it could imply. [For that matter, none of the Rybka versions studied were "commercial" at the time of study -- this was one of the reasons why R3 was not considered].

Presumably if Rajlich felt his rights were breached on this matter, he could take the ICGA to court. This seems about as likely as the ICGA filing a defamation suit against someone who unmitigatingly says they used "ILLEGAL means" to find evidence. :cool:
Also the evidence that Rybka is a copy of Fruit is ridiculous:
(1) The most fundamental data structure of a chess program is its board representation. A bitboard representation is fundamentally different from a Mailbox representation.
(2) A chess engine's most fundamental algorithm is its search. Rybka uses PVS, which is fundamentally different from Fruit's MTDf.
(3) The evaluation function is the third major component of a chess program, and has THE major impact on the program's choice of a move.
Rybka often chooses different (usually stronger) moves than Fruit, thus its evaluation is fundamentally different from Fruit's.

So - the three most important ingredients of Rybka and Fruit differ greatly. Thus Rybka and Fruit are fundamentally different.
#1 is not that relevant for originality or copyright. It is somewhat like saying that the most fundamental component of telling a story is the medium used, and then saying that a movie adaptation of a book is thereby different. This would be news to Hollywood producers, who regularly pay for such rights.

#2 is factually wrong, as Fruit does not use MTD(f).

#3 leads one to consider what "chooses a move" means, as it seems to imply some specified amount of time for the computation, e.g., one shouldn't compare Engine X to itself slowed by a factor of 5. Adam Hair has done some work on how to try to normalise this, though my impression is that there is still a good amount of guesswork. Even assuming this has been done, the difference in choice of move can come from either search or evaluation (one example of search-borne differencing would be the early Gothmog/Glaurung, where I think king safety rather than score was a major search driver).

In any event, the most direct way to determine if two evaluation functions differ is to abstract/filter/compare their component parts and the relative influence therein; comparing them functionally via their return value [on a suite of positions] is a less direct though sometimes useful method; comparing move choice -- where search, time management, internal speed, ..., all can affect the measurement -- would typically be worse than the previous two methods when one wants to measure differences specifically in evaluation. As noted in a previous post, the first method will also detect infringements of copyright-protected expression that the latter two methods would often elide.
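For what it's worth, the second (functional) method can at least be sketched mechanically; the snippet below assumes the python-chess package, and the engine paths and depth/suite choices are placeholders. Note that fixed-depth scores still conflate search with evaluation, which is part of the caveat above:

Code: Select all

# Sketch of the "functional" comparison: fixed-depth scores over a small
# position suite.  Assumes the python-chess package; engine paths are
# placeholders, and depth/suite choices are arbitrary.
import chess
import chess.engine

FENS = [
    chess.STARTING_FEN,
    "r1bqkbnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq - 2 3",
]

def scores(engine_path, fens, depth=8):
    eng = chess.engine.SimpleEngine.popen_uci(engine_path)
    try:
        out = []
        for fen in fens:
            info = eng.analyse(chess.Board(fen), chess.engine.Limit(depth=depth))
            out.append(info["score"].white().score(mate_score=100000))
    finally:
        eng.quit()
    return out

a = scores("/path/to/engineA", FENS)      # placeholder engine paths
b = scores("/path/to/engineB", FENS)
diffs = [abs(x - y) for x, y in zip(a, b)]
print(diffs, sum(diffs) / len(diffs))     # per-position and mean score gaps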

Also, I have no idea what the phrase "fundamentally different" means. Lawyers tend to use phrases such as "substantially similar". I would agree that the evidence that Rybka is a copy of Fruit is "ridiculous" -- but only if the word "copy" is construed as "literal copy", whereas most copyright violations proceed elsewise, as would most plagiarism investigations.
