Thoughts on Fruit=Rybka EVAL

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: Thoughts on Fruit=Rybka EVAL

Post by BB+ » Tue Aug 16, 2011 9:38 pm

Rebel wrote:Yes, there is an extra fixed penalty for an isolated pair.
I have no PS for double pawns, never was able to find a good reason for that.
OK, so I (in EVAL_COMP) would conclude that your isolated pawns method is not a "1.0" overlap with Fruit/Rybka/Faile in this example. In other words, the specific way you chose to implement the general concept of isolated pawns (before any tuning) differed to some degree from that which they chose.

This is what the main focus of EVAL_COMP was -- whether the overlap between Fruit and Rybka in the specifics of choice and rendition of evaluation features was more than one could reasonably expect. This was the "abstraction" method used, to overcome the bitboard translation issue that would exist under a "code" standard. One can certainly criticise EVAL_COMP on many grounds (e.g. CW wondering if it "filtered" enough), but I just want to be clear concerning what it measured.

As for the internal ordering in the evaluation function, I reiterate that I don't think there is much there either way. Since this seems to be a serious inquiry, I will put on my to-do list the enumeration, collation, checking, and writing-up of more specifics.

Chris Whittington
Posts: 437
Joined: Wed Jun 09, 2010 6:25 pm

Re: Thoughts on Fruit=Rybka EVAL

Post by Chris Whittington » Tue Aug 16, 2011 10:13 pm

BB+ wrote:
Rebel wrote:Yes, there is an extra fixed penalty for an isolated pair.
I have no PS for double pawns, never was able to find a good reason for that.
OK, so I (in EVAL_COMP) would conclude that your isolated pawns method is not a "1.0" overlap with Fruit/Rybka/Faile in this example. In other words, the specific way you chose to implement the general concept of isolated pawns (before any tuning) differed to some degree from that which they chose.

This is what the main focus of EVAL_COMP was -- whether the overlap between Fruit and Rybka in the specifics of choice and rendition of evaluation features was more than one could reasonably expect. This was the "abstraction" method used, to overcome the bitboard translation issue that would exist under a "code" standard. One can certainly criticise EVAL_COMP on many grounds (e.g. CW wondering if it "filtered" enough), but I just want to be clear concerning what it measured.

As for the internal ordering in the evaluation function, I reiterate that I don't think there is much there either way. Since this seems to be a serious inquiry, I will put on my to-do list the enumeration, collation, checking, and writing-up of more specifics.

One might also wonder whether the comparators were cherry-picked.

Rebel
Posts: 515
Joined: Wed Jun 09, 2010 7:45 pm
Real Name: Ed Schroder

Re: Thoughts on Fruit=Rybka EVAL

Post by Rebel » Tue Aug 16, 2011 10:14 pm

BB+ wrote:
Rebel wrote:Yes, there is an extra fixed penalty for an isolated pair.
I have no PS for double pawns, never was able to find a good reason for that.
OK, so I (in EVAL_COMP) would conclude that your isolated pawns method is not a "1.0" overlap with Fruit/Rybka/Faile in this example. In other words, the specific way you chose to implement the general concept of isolated pawns (before any tuning) differed to some degree from that which they chose.
But I reject EVAL_COMP fully. Next to Rybka and Fruit you should have taken Shredder, Junior, Fritz (yes Fritz), Hiarcs. In 2005 those were the programs with good chess knowledge inside; they were on top for a good reason. You would have had a whole different outcome.

I reject the whole EVAL comparison BTW. I don't think you have fully understood my criticism in the opening post. Chess programmers have limited choices regarding coding and data. To explain:

DATA: values interpret chess knowledge. And if you do a bad job here (bad values) your program will play miserably. So all the bonuses/penalties, tables, PST values in good chess programs are alike, similar and sometimes equal. Ask Bob; as of a few days ago he knows all about it. It was a surprise for me too, but I also learn from the discussions. Miguel has done a fine job here.

CODE: programmers are forced to write speedy code. Given that demand they come up with similar solutions and code.

Chess programming is not math only, there are a few buts...

hyatt
Posts: 1242
Joined: Thu Jun 10, 2010 2:13 am
Real Name: Bob Hyatt (Robert M. Hyatt)
Location: University of Alabama at Birmingham

Re: Thoughts on Fruit=Rybka EVAL

Post by hyatt » Tue Aug 16, 2011 10:37 pm

Except that your conclusions are demonstrably wrong.

What say we pick mobility, since most seem to do that?

One can do that in many ways.

(1) Early Crafty used the Slate bitboard approach which had incrementally updated attacks_to/attacks_from bitboards. Those showed which squares the piece on a specific square attacked, or which squares attack the specific square. Given the former, all I had to do was a simple popcnt() operation and multiply it by whatever constant I thought would work well.

(2) In Cray Blitz, we first did all the usual evaluation stuff, and filled in a 64-word array for each side with a number representing the "interest" of that square. Interest might mean this square is close to the opponent's king, or this square is along the path the passed pawn on this file must cross over, or this square is in the center, or this square can't be attacked by an enemy pawn and is weak. One can keep going on what makes a square interesting. For our mobility, thanks to the Cray's clever "vector merge" instruction, I could, for queens, take a bitmap showing which squares the queen attacked, and use that to extract/sum all those "interest" scores but just including the squares attacked by the queen. Tried this in Crafty, but it was way too slow. At least in 1995. Perhaps today...

(3) one can take the Fruit approach and simply enumerate the squares a piece attacks, one by one, and then count the total.

(4) one can do the same thing but exclude squares attacked by enemy pawns.

(5) one can do the same thing but exclude squares attacked by more enemy than friendly pieces.

(6) one can weight squares toward the enemy side of the board higher and squares on your own side less.

(7) one can weigh squares differently depending on their closeness to the center (Crafty does this at present).

(8) one can do as I do and precompute the mobility for each piece type before the game starts, just like I precompute the move generator stuff for the magic move stuff. Then a mobility evaluation is just a table lookup and very fast.

(9) one can do (8) combined with most any of the previous examples.

(10) the above can be combined for more options.

(11) and finally, there are several ways to implement each approach.

One "idea". Hundreds of choices for implementation details...
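
To make a couple of the options above concrete, here is a minimal C sketch for knights only. It is not taken from Crafty, Fruit or any other engine: the square numbering (0 = a1, 63 = h8), the function names, and the decision not to count squares occupied by own pieces are all assumptions made for illustration. It shows a plain popcount of the attack set, and the variant from (4) that drops squares attacked by enemy (here black) pawns.

```c
#include <stdint.h>

typedef uint64_t Bitboard;

/* Portable population count (Kernighan's method). */
static int popcount64(Bitboard b) {
    int n = 0;
    while (b) { b &= b - 1; n++; }
    return n;
}

/* Knight attack set from square sq; the masks stop wrap-around across the board edge. */
static Bitboard knight_attacks(int sq) {
    const Bitboard not_a  = 0xfefefefefefefefeULL;
    const Bitboard not_ab = 0xfcfcfcfcfcfcfcfcULL;
    const Bitboard not_h  = 0x7f7f7f7f7f7f7f7fULL;
    const Bitboard not_gh = 0x3f3f3f3f3f3f3f3fULL;
    Bitboard b = 1ULL << sq;
    return ((b << 17) & not_a)  | ((b << 10) & not_ab) |
           ((b >> 6)  & not_ab) | ((b >> 15) & not_a)  |
           ((b << 15) & not_h)  | ((b << 6)  & not_gh) |
           ((b >> 10) & not_gh) | ((b >> 17) & not_h);
}

/* Squares attacked by black pawns (they capture towards rank 1). */
static Bitboard black_pawn_attacks(Bitboard pawns) {
    return ((pawns >> 7) & 0xfefefefefefefefeULL) |
           ((pawns >> 9) & 0x7f7f7f7f7f7f7f7fULL);
}

/* Plain count of attacked squares not occupied by own pieces. */
int knight_mobility_plain(int sq, Bitboard own) {
    return popcount64(knight_attacks(sq) & ~own);
}

/* Option (4): as above, but squares covered by enemy pawns do not count. */
int knight_mobility_safe(int sq, Bitboard own, Bitboard enemy_pawns) {
    Bitboard targets = knight_attacks(sq) & ~own;
    return popcount64(targets & ~black_pawn_attacks(enemy_pawns));
}
```

Either count would then be multiplied by a constant or fed into a table, which is yet another place where two implementations of the same "idea" can diverge.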


Re: Thoughts on Fruit=Rybka EVAL

Post by BB+ » Wed Aug 17, 2011 12:40 am

There are some historical examples of eval features at https://icga.wikispaces.com/Evaluation+Overlap
Even with many of these being quite "vague", there is noticeable divergence.
Rebel wrote:Chess programmers have limited choices regarding coding and data.
The fact that there is such a wide variety of eval functions around essentially rebuts this. Back in 1995, did HIARCS, REBEL, Junior, MChess, ..., [top programs on the same hardware] all manage to end up with similar evaluation functions from "limited" choices therein?

Or a (small) selection from http://www.rebel.nl/authors.htm
The King wrote:From the start, The King was given an attractive and enterprising playing style. Unlike many other computer programs, The King actively seeks attacking possibilities and is ready to sacrifice material not only on tactical, but also on positional grounds. Results and playing strength of the program steadily increased, and it has been among the world’s strongest ever since 1990.
Fritz wrote:Fritz is built around a selective search technique known as the null-move search. As part of its search, Fritz allows one side to move twice (the other side does a null-move). This allows the program to detect weak moves before they are searched to their full depth. Move generators, evaluation functions and data structures have been designed to maximize the effectiveness of the null-move search.
Gandalf wrote:Gandalf was started around 1985 by Steen Suurballe. The program was a rule-based selective program, which was very slow, but did surprisingly well. In 1993 Dan Wulff joined in the work, and has been doing the opening library ever since.
In 1995 Steen decided to skip the selective search, and concentrate on the evaluation function. The program got much stronger after this change, and although it has become a lot faster than the prior version, it is still rather slow, when compared with other programmes.
The search was changed to a standard alpha/beta search, with null-move reductions, and a lot of extensions.
HIARCS wrote:HIARCS searches around an order of magnitude less positions per second (av. 18,000) than most of its competitors. However, it makes up for this apparent slow speed by clever searching and accurate evaluation.

HIARCS uses many selective search extension heuristics to guide the search and incorporates a sophisticated tapered search to resolve tactical uncertainties while finding positionally beneficial lines.
Does this sound like they were "limited" in their differential aspects? [Fritz actually designed eval to maximise null-move effectivity?!].
Rebel wrote: But I reject EVAL_COMP fully. Next to Rybka and Fruit you should have taken Shredder, Junior, Fritz (yes Fritz), Hiarcs. In 2005 those were the programs with good chess knowledge inside; they were on top for a good reason. You would have had a whole different outcome.
I disagree (quite strongly) with your last sentence. I see little reason to think that these engines used evaluation features that had large overlap. There is little if any evidence to think that engines "at the top" must have similar evaluation, either in 2005, or at any juncture. One can note that Don Dailey is currently championing the evaluation of Komodo as being its main plus. R232a had a Fruit-like eval, R3 has a heavyweight eval, while R4 is back to a lightweight one [though not so Fruit-like]. Yet all of them were at the top in their time. Similarly, Stockfish differs from Critter.


Re: Thoughts on Fruit=Rybka EVAL

Post by BB+ » Wed Aug 17, 2011 2:43 am

Here is my listing for Rybka/Fruit eval ordering.
Here is the Fruit top-level:
   material_get_info(mat_info,board);                                           
   opening += board->opening;   endgame += board->endgame; // PST               
   opening += mat_info->opening; endgame += mat_info->endgame;                  
   pawn_get_info(pawn_info,board); // also add in the open/endg score           
   eval_draw(board,mat_info,pawn_info,mul); // no Rybka equivalent              
/// ...
   eval_piece(board,mat_info,pawn_info,&opening,&endgame);                      
   eval_king(board,mat_info,&opening,&endgame);                                 
   eval_passer(board,pawn_info,&opening,&endgame);                              
   eval_pattern(board,&opening,&endgame);                                       
   phase = mat_info->phase; // interpolation                                    
   eval = ((opening * (256 - phase)) + (endgame * phase)) / 256;                
   // drawish bishop endgames
/// ...
   // draw bound // this essentially is in the Rybka mat-table, I think

Internal to these there will be
* a choice of order of elements in pawn evaluation (pawn_get_info)
* a choice of order of pieces in eval_piece, then of features for each piece
* a choice of whether to do pieces or pawns first in eval_king
* a choice of order of what passer elements to consider
* a choice of what order to do the patterns, also interspersing white/black.

As can be seen, the third one is not much in the way of content; Fruit chooses pieces then
pawns in king safety, as does Rybka. The fourth one also only has a few elements.
Here is Rybka reconstructed at the top-level:
*) Get the material token, imbalance
*) Consider lazy eval
*) prefetch the pawnhash entry [can be reordered by the compiler]
*) add the static values (to the material value)
*) evaluate white pieces (PNBRQ)
*) call pawneval
*) evaluate black pieces (PNBRQ)
*) evaluate king safeties (pieces then pawn shelter/storm)
*) evaluate passers
*) evaluate patterns (trapped bishops then blocked bishops, then blocked rooks)
*) interpolate open/endg
*) adjustment in drawish bishop endgames

However, some of this ordering is "forced" by the context (e.g., piece_eval must precede
that part of king safety), and/or is the "plain obvious" thing to do.
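
The interpolation line in the Fruit listing above can be isolated as a small C function. This is only a sketch: the function name is made up, and it assumes that phase runs from 0 (pure opening) to 256 (pure endgame), which is what the division by 256 suggests.

```c
/* Blend the opening and endgame scores by game phase.
 * Assumption: phase is in 0..256, with 0 = opening and 256 = endgame. */
int interpolate(int opening, int endgame, int phase) {
    return ((opening * (256 - phase)) + (endgame * phase)) / 256;
}
```

With opening = 100 and endgame = 20, a phase of 0 returns 100, a phase of 256 returns 20, and a phase of 128 returns 60. Both programs keep two running totals and blend them once at the end, so this step says little by itself; the comparison is about what goes into the totals and in which order.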


Re: Thoughts on Fruit=Rybka EVAL

Post by BB+ » Wed Aug 17, 2011 3:04 am

The "eval_piece" equivalent in Rybka is to loop over the bitboards in PNBRQ order, while Fruit
loops over piece-lists. These are not too comparable. Furthermore, Rybka computes the "piece king safety"
in these loops, while Fruit does not, choosing to compute that in the eval_king routine (which makes a
second loop over piece lists).
For patterns, there is for Fruit
trapped bishops 7th rank: white queenside, then kingside; and black queenside, then kingside             
trapped bishops 6th rank: white queenside, then kingside; and black queenside, then kingside             
blocked bishops: white queenside, then kingside; and black queenside, then kingside             
blocked rooks: white queenside, then kingside; and black queenside, then kingside             

Rybka does not distinguish between 7th/6th rank for trapped bishops, but otherwise has the same order.
Rybka also only scores a maximum of one trapped bishop per color.
For passers, Fruit computes a base score, then a dangerous bonus (free passer), then a king-distance score.

Rybka has the same order, though "free passer" has its scoring capability subdivided into 3 parts.

Inside the "free passer" logic, the order for Fruit is:
 if the pawn is not blocked, // split into 2 parts in Rybka
 and if the pawn can "safely" advance (computed differently in the two), then a bonus is added.
Rybka has
 if we do not block the pawn, add a bonus
 if they do not block the pawn, add a bonus
 if the pawn can "safely" advance, add a bonus
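
The structural difference in the "free passer" logic can be sketched in C as follows. Everything here is hypothetical: the function names, the boolean inputs and the bonus values are invented for illustration, and the real engines compute "safely advance" differently from each other, as noted above. The first function gives one bonus when all conditions hold; the second awards three separate bonuses, following the three-line Rybka description.

```c
#include <stdbool.h>

/* Hypothetical bonus values, chosen only so the example has numbers. */
enum { FREE_BONUS = 60, OWN_CLEAR_BONUS = 10, OPP_CLEAR_BONUS = 20, SAFE_ADVANCE_BONUS = 30 };

/* Single combined test: pawn not blocked by anyone and able to advance safely. */
int free_passer_single(bool own_blocks, bool opp_blocks, bool safe_advance) {
    if (!own_blocks && !opp_blocks && safe_advance)
        return FREE_BONUS;
    return 0;
}

/* Split version: each condition contributes its own bonus. */
int free_passer_split(bool own_blocks, bool opp_blocks, bool safe_advance) {
    int bonus = 0;
    if (!own_blocks)  bonus += OWN_CLEAR_BONUS;     /* we do not block the pawn */
    if (!opp_blocks)  bonus += OPP_CLEAR_BONUS;     /* they do not block the pawn */
    if (safe_advance) bonus += SAFE_ADVANCE_BONUS;  /* the pawn can safely advance */
    return bonus;
}
```

In this toy version the two agree when every condition holds, but a pawn blocked by a friendly piece gets nothing from the first function and a partial bonus from the second.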
Finally there are orderings inside each piece.
In each case, Rybka places the "king safety" calculation at the beginning, while as noted above,
Fruit has a second loop over the piece-list for this. For most pieces there is so little content as to be ignorable;
only rooks have much to say, where the order is common.

Knights: compute mobility (nothing else)
Bishops: compute mobility (nothing else)
Rooks: compute mobility, SemiOpen, Open [including opp king], 7th rank
Queens: compute mobility, then 7th rank

There is also the statement to see if "7th rank" should count, which both do by first seeing if the opponent
has a pawn on 7th, and then (if not) seeing if the opponent has a king on the 8th.
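
The "7th rank" condition just described, written out as a C sketch from White's point of view. The bitboard representation and the function name are assumptions made for illustration (Fruit itself is not a bitboard program); only the order of the two tests comes from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Bitboard;

#define RANK_7 0x00ff000000000000ULL
#define RANK_8 0xff00000000000000ULL

/* Should a white rook or queen on the 7th rank get the bonus? */
bool seventh_rank_counts(Bitboard black_pawns, Bitboard black_king) {
    if (black_pawns & RANK_7)   /* first test: opponent has a pawn on the 7th */
        return true;
    if (black_king & RANK_8)    /* then, if not: opponent king on the 8th */
        return true;
    return false;
}
```

Swapping the two tests would give the same result, so the shared order is one of the small points where the two programs could have differed but did not.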
Lastly there is pawneval. Here is the ordering for Fruit, which separated
(possibly for development reasons) the computation and scoring phases:

compute doubled
compute isolated/backward [if-switch, can't be both, of course]
compute open
compute candidate
score doubled, then isolated (open/closed), then backward (open/closed), then candidate

Rybka does doubled, isolated/backward, then candidate, but scoring along the way.
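
The two styles (compute everything, then score, versus scoring along the way) can be illustrated with a deliberately simplified C sketch. It works per file rather than per pawn, covers only doubled and isolated pawns, and uses invented penalty values and function names; none of it is taken from either program.

```c
#include <stdbool.h>

/* Hypothetical penalties. */
enum { DOUBLED_PENALTY = 10, ISOLATED_PENALTY = 20 };

/* files[f] = number of own pawns on file f (0 = a-file, 7 = h-file). */
static bool is_doubled(const int files[8], int f) {
    return files[f] > 1;
}

static bool is_isolated(const int files[8], int f) {
    bool left  = f > 0 && files[f - 1] > 0;
    bool right = f < 7 && files[f + 1] > 0;
    return !left && !right;
}

/* Style A: first pass computes the flags, second pass scores them. */
int pawn_score_two_phase(const int files[8]) {
    bool doubled[8], isolated[8];
    int score = 0;
    for (int f = 0; f < 8; f++) {
        doubled[f]  = files[f] > 0 && is_doubled(files, f);
        isolated[f] = files[f] > 0 && is_isolated(files, f);
    }
    for (int f = 0; f < 8; f++) {
        if (doubled[f])  score -= DOUBLED_PENALTY;
        if (isolated[f]) score -= ISOLATED_PENALTY * files[f];
    }
    return score;
}

/* Style B: each feature is scored as soon as it is detected. */
int pawn_score_inline(const int files[8]) {
    int score = 0;
    for (int f = 0; f < 8; f++) {
        if (files[f] == 0) continue;
        if (is_doubled(files, f))  score -= DOUBLED_PENALTY;
        if (is_isolated(files, f)) score -= ISOLATED_PENALTY * files[f];
    }
    return score;
}
```

Both functions return the same score for any input, so the split into phases is a matter of code organisation rather than of evaluation content.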

Fruit computes the shelter/storm in eval_king, while Rybka does it in pawneval (using FileWing). I don't think
the two are very comparable (Rybka has its characteristic 4x3 blocks, while Fruit loops over the 3 files).
Overall, I just can't see much either way. There are various places where Rybka follows a different agenda (such as re-factoring the code), but on the other hand, there is an assortment of orderings (most rather minor by themselves) that are candidates for being different, that in fact are not. One could argue that in most of the places where there is a re-ordering, there is some external reason for this, possibly from the performance standpoint (like shelter_storm in pawn_eval), or maybe due to ease of development. However, I'm not completely convinced that every Rybka/Fruit difference could be explained by this.

An alternative example where Rybka does re-order a few Fruit elements is in the UCI parsing, where Fruit works alphabetically (binc/btime/depth/infinite/mate/movestogo/movetime/nodes/ponder/searchmoves/winc/wtime), while Rybka reorders this (winc/wtime/binc/btime/depth/infinite/movestogo/movetime/ponder), placing winc/wtime at the front (and not having mate/nodes/searchmoves). So other than winc/wtime at the front, it's alphabetical in Rybka -- but the winc/wtime change shouldn't be simply ignored. Again it is unclear to me whether there might be some external "testing"(?) reason why one might want to move winc/wtime to the front. Zach or I could track down the order in Rybka 1.6.1, if it is thought relevant.


Re: Thoughts on Fruit=Rybka EVAL

Post by BB+ » Wed Aug 17, 2011 3:25 am

BB+ wrote:An alternative example where Rybka does re-order a few Fruit elements is in the UCI parsing [...] Zach or I could track down the order in Rybka 1.6.1, if it is thought relevant.
Here is the Rybka 1.6.1 order:

Code:

0x00444697:     push   $0x4b15d0 // searchmoves
0x00444701:     push   $0x4b15b0 // ponder
0x00444735:     push   $0x4b15a0 // wtime
0x004447a6:     push   $0x4b1598 // btime
0x00444817:     push   $0x4b1590 // winc
0x00444888:     push   $0x4b1588 // binc
0x004448f9:     push   $0x4b157c // movestogo
0x00444946:     push   $0x4b1574 // depth
0x0044495b:     push   $0x4b156c // nodes
0x004449aa:     push   $0x4b1564 // mate
0x004449ec:     push   $0x4b1558 // movetime
0x00444a5b:     push   $0x4b154c // infinite
OTOH, switching to alphabetical (with winc/wtime at the front) makes some amount of sense as a development consideration, so at best I think we are guessing. As with many things, asking Rajlich about it would seem to be the most logical path toward resolution.
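
For readers unfamiliar with what is being ordered here, a parser for the UCI "go" command is typically a chain of string comparisons, one per keyword. The C sketch below is hypothetical (the struct, the function names and the 256-byte buffer are my own choices, not code from Fruit or Rybka); the comparisons are simply written in the later-Rybka order quoted above, winc/wtime first and then alphabetical.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int wtime, btime, winc, binc;
    int movestogo, depth, movetime;
    int ponder, infinite;
} GoParams;

static const char *DELIMS = " \t\r\n";

/* Read the next token as an integer; 0 if the line ends early. */
static int next_int(void) {
    const char *t = strtok(NULL, DELIMS);
    return t != NULL ? atoi(t) : 0;
}

/* Parse a line such as "go wtime 60000 btime 55000 winc 1000 binc 500". */
void parse_go(const char *line, GoParams *p) {
    char buf[256];
    memset(p, 0, sizeof *p);
    strncpy(buf, line, sizeof buf - 1);   /* strtok modifies its input, so work on a copy */
    buf[sizeof buf - 1] = '\0';

    for (char *tok = strtok(buf, DELIMS); tok != NULL; tok = strtok(NULL, DELIMS)) {
        if      (strcmp(tok, "winc") == 0)      p->winc = next_int();
        else if (strcmp(tok, "wtime") == 0)     p->wtime = next_int();
        else if (strcmp(tok, "binc") == 0)      p->binc = next_int();
        else if (strcmp(tok, "btime") == 0)     p->btime = next_int();
        else if (strcmp(tok, "depth") == 0)     p->depth = next_int();
        else if (strcmp(tok, "infinite") == 0)  p->infinite = 1;
        else if (strcmp(tok, "movestogo") == 0) p->movestogo = next_int();
        else if (strcmp(tok, "movetime") == 0)  p->movetime = next_int();
        else if (strcmp(tok, "ponder") == 0)    p->ponder = 1;
        /* unknown tokens, including "go" itself, are ignored */
    }
}
```

Reordering the branches does not change what gets parsed, since each token matches at most one keyword.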


Re: Thoughts on Fruit=Rybka EVAL

Post by Rebel » Wed Aug 17, 2011 8:33 am

BB+ wrote: Overall, I just can't see much either way. There are various places where Rybka follows a different agenda (such as re-factoring the code), but on the other hand, there is an assortment of orderings (most rather minor by themselves) that are candidates for being different, that in fact are not. One could argue that in most of the places where there is a re-ordering, there is some external reason for this, possibly from the performance standpoint (like shelter_storm in pawn_eval), or maybe due to ease of development. However, I'm not completely convinced that every Rybka/Fruit difference could be explained by this.

An alternative example where Rybka does re-order a few Fruit elements is in the UCI parsing, where Fruit works alphabetically (binc/btime/depth/infinite/mate/movestogo/movetime/nodes/ponder/searchmoves/winc/wtime), while Rybka reorders this (winc/wtime/binc/btime/depth/infinite/movestogo/movetime/ponder), placing winc/wtime at the front (and not having mate/nodes/searchmoves). So other than winc/wtime at the front, it's alphabetical in Rybka -- but the winc/wtime change shouldn't be simply ignored. Again it is unclear to me whether there might be some external "testing"(?) reason why one might want to move winc/wtime to the front. Zach or I could track down the order in Rybka 1.6.1, if it is thought relevant.
Thank you for the clarification.

To say it in a few words, you basically discovered that the evals of 2 good chess programs contain about the same amount of chess knowledge and that their implementations look similar.

I could have told you that from the beginning.


Re: Thoughts on Fruit=Rybka EVAL

Post by Rebel » Wed Aug 17, 2011 8:51 am

BB+ wrote:There are some historical examples of eval features at https://icga.wikispaces.com/Evaluation+Overlap
Even with many of these being quite "vague", there is noticeable divergence.
Rebel wrote:Chess programmers have limited choices regarding coding and data.
The fact that there is such a wide variety of eval functions around essentially rebuts this. Back in 1995, did HIARCS, REBEL, Junior, MChess, ..., [top programs on the same hardware] all manage to end up with similar evaluation functions from "limited" choices therein?

Or a (small) selection from http://www.rebel.nl/authors.htm
The King wrote:From the start, The King was given an attractive and enterprising playing style. Unlike many other computer programs, The King actively seeks attacking possibilities and is ready to sacrifice material not only on tactical, but also on positional grounds. Results and playing strength of the program steadily increased, and it has been among the world’s strongest ever since 1990.
Fritz wrote:Fritz is built around a selective search technique known as the null-move search. As part of its search, Fritz allows one side to move twice (the other side does a null-move). This allows the program to detect weak moves before they are searched to their full depth. Move generators, evaluation functions and data structures have been designed to maximize the effectiveness of the null-move search.
Gandalf wrote:Gandalf was started around 1985 by Steen Suurballe. The program was a rule-based selective program, which was very slow, but did surprisingly well. In 1993 Dan Wulff joined in the work, and has been doing the opening library ever since.
In 1995 Steen decided to skip the selective search, and concentrate on the evaluation function. The program got much stronger after this change, and although it has become a lot faster than the prior version, it is still rather slow, when compared with other programmes.
The search was changed to a standard alpha/beta search, with null-move reductions, and a lot of extensions.
HIARCS wrote:HIARCS searches around an order of magnitude less positions per second (av. 18,000) than most of its competitors. However, it makes up for this apparent slow speed by clever searching and accurate evaluation.

HIARCS uses many selective search extension heuristics to guide the search and incorporates a sophisticated tapered search to resolve tactical uncertainties while finding positionally beneficial lines.
Does this sound like they were "limited" in their differential aspects? [Fritz actually designed eval to maximise null-move effectivity?!].
These are programmers promoting their baby, naming some specifics in which they feel it excels. Does that automatically imply that the item is not in other programs?
BB+ wrote: I disagree (quite strongly) with your last sentence. I see little reason to think that these engines used evaluation features that had large overlap.
But you are not a chess programmer.

And you haven't tried Junior, Hiarcs, Fritz.

But you can take Rebel, most of the eval stuff is an easy read.
