Retrograde tablebase methods

Post by BB+ » Fri Nov 26, 2010 2:26 am

Since I mentioned these two methods elsewhere, I thought I would describe them more fully here. H. G. Muller used to have a description of these on his web page, but the link seems broken now, and the information about the 7-piece generator is somewhat different.

The "grandfather" and "out-counts" methods of building tablebases

One way to build tablebases is to work in an iterative retrograde manner. Given positions that are known to be losses-in-X (where X is the number of ply, starting at X=0 with the side to move mated), we generate all "takeback" moves, and each resulting position is a win, at worst a win-in-(X+1).
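
Here is a minimal sketch of that step in C; the flat position index, NPOS, and gen_takebacks() are hypothetical placeholders, not taken from any particular generator:

#include <stdint.h>

#define NPOS  1000000u  /* hypothetical number of indexed positions */
#define MAXTB 64        /* hypothetical bound on moves per position */

/* Assumed to exist elsewhere: fills buf with the indices of every
   position that can reach pos in one legal move, returns how many. */
extern int gen_takebacks(uint32_t pos, uint32_t *buf);

static uint8_t win_in[NPOS];   /* 0 = not a known win, else win-in-N ply */
static uint8_t loss_now[NPOS]; /* 1 = loss-in-X at the current iteration */

/* One retrograde step: every position with a move into a loss-in-X
   is a win, at worst a win-in-(X+1). */
static void wins_from_losses(int x)
{
    uint32_t pred[MAXTB];
    for (uint32_t p = 0; p < NPOS; p++) {
        if (!loss_now[p])
            continue;
        int n = gen_takebacks(p, pred);
        for (int i = 0; i < n; i++)
            if (win_in[pred[i]] == 0)      /* don't overwrite a faster win */
                win_in[pred[i]] = (uint8_t)(x + 1);
    }
}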

The bifurcation in the methods occurs when we want to go from wins-in-(X+1) to losses-in-(X+2). Again we take a position known to be a win-in-(X+1) and generate "takeback" moves. However, a takeback from a win need not yield a position that is a loss. The "grandfather" method determines whether such a candidate position is a loss by generating all forward moves from it and checking whether every one of them leads to a position already known to be a win -- if so, then the position is a loss (in X+2). I'm not sure "grandfather" is the best term, as the relation is more like siblings.
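
Continuing the sketch above, the grandfather test might look like this (gen_moves() is another assumed helper, generating the forward moves of a position):

extern int gen_moves(uint32_t pos, uint32_t *buf);

static int is_loss_grandfather(uint32_t q)
{
    uint32_t succ[MAXTB];
    int n = gen_moves(q, succ);
    for (int i = 0; i < n; i++)
        if (win_in[succ[i]] == 0)
            return 0;   /* some move is not (yet) known to lose */
    return 1;           /* every move reaches a known win: a loss-in-(X+2) */
}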

The "out-counting" method does a large pre-computation at the start, counting how many legal moves each position has. When a win-in-(X+1) leads to a position via a takeback, an "out" (perhaps a term from poker) is subtracted from the count of this position. When the count reaches zero, there are no moves left that don't lose, and so the position can be marked as a loss (in X+2).

That's it in a nutshell, though there are many delicacies, particularly with symmetries in the out-counting method. The use of DTC makes either method a bit easier; with DTM (handled in the most straightforward manner) you need to look at reductions to lesser material at some point.
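
To illustrate the DTC/DTM point: in the loss test, a forward capture leaves the current material and has to be scored from an already-built sub-tablebase. A sketch of the difference, with probe_sub_wdl() and probe_sub_dtm() as hypothetical probing routines whose values are from the viewpoint of the side to move in the probed position:

extern int probe_sub_wdl(uint32_t pos);   /* +1 win, 0 draw, -1 loss */
extern int probe_sub_dtm(uint32_t pos);   /* plies to mate, when won */

/* DTC: the probe only needs win/draw/loss, since a winning capture is
   simply a conversion. The capture refutes the loss unless the resulting
   position is won for the opponent (the side to move there). */
static int capture_refutes_loss(uint32_t pos_after_capture)
{
    return probe_sub_wdl(pos_after_capture) <= 0;
}

/* DTM: the mate distance must also be fetched and carried back, one ply
   being added for the capture itself. */
static int capture_loss_depth(uint32_t pos_after_capture)
{
    return probe_sub_dtm(pos_after_capture) + 1;
}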

One disadvantage of the grandfather method is that you are constantly generating "forward" moves and testing whether the results are known. For instance, in a QR vs Q set-up, every time a win-in-(X+1) is found, you nominally need to generate all the possible moves for the losing side (maybe about 25 of them) and check whether any of them gives an unknown result. This makes it notably slower in many instances.

With the out-counting method, the pre-computation can in some circumstances be overkill, for instance when most of the positions are draws or quick wins via reduction by a capture, and so you don't really care exactly how many outs/moves there are for most positions. The out-counting method also suffers a bit with "heavy" pieces that have many possible moves, though I don't think as much as with the grandfather method.

The out-counting method takes more memory, as you need to store: a bit array for positions that are known to be wins, an array listing the number of remaining "outs" (moves not known to lose) for each position, and two lists giving the wins/losses at the most recent iteration (possibly as bit arrays, at least when the number of such positions is large). If the number of outs is stored (say) as a byte [though a reduction to 6 bits and maybe less should be possible], this requires between 9/8 and 11/8 bytes (that is, 9 to 11 bits) per position, depending on how the lists are stored. [We can assume you are building both a tablebase and its complement at the same time, though you don't need to access both simultaneously.]
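
Collecting that working set in one place (the same hypothetical NPOS as before, with the per-iteration lists shown in their bit-array form):

static uint8_t win_bits[NPOS / 8 + 1];    /* 1 bit/pos: known wins */
static uint8_t out_count[NPOS];           /* 8 bits/pos: remaining outs */
static uint8_t iter_win[NPOS / 8 + 1];    /* 1 bit/pos: wins of this iteration */
static uint8_t iter_loss[NPOS / 8 + 1];   /* 1 bit/pos: losses of this iteration */

/* 8 + 1 = 9 bits/pos when the two per-iteration sets are kept as short
   explicit lists instead, up to 8 + 3 = 11 bits/pos with all four arrays
   as above: the 9/8 to 11/8 bytes per position quoted above. */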

The grandfather method also needs 4 arrays, but now the array with the remaining number of outs can be replaced by a bit array for positions known to be losses. So essentially we replace a byte by a bit, which puts it at something between 2/8 and 4/8 bytes (2 to 4 bits) per position. When building from disk, the reduction in space usage by a factor of roughly 3 or more can easily offset the slower computations of the grandfather method. If the whole computation fits into RAM, the out-counting method will be much faster. I am not sure how much the above arrays can be compressed (and indeed, a more parsimonious indexing scheme following Nalimov/Haworth/Heinz might profitably be used here), with there again being some sort of time/space trade-off in doing such a compression.
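
To put rough numbers on that factor, here is the arithmetic for an arbitrary example size (not a real table):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t n = 100000000ULL;       /* hypothetical: 1e8 positions */
    double oc = n * 11.0 / 8.0;      /* out-counting: up to 11 bits/pos, in bytes */
    double gf = n * 4.0 / 8.0;       /* grandfather: up to 4 bits/pos, in bytes */
    printf("out-counting: %6.0f MB\n", oc / 1e6);  /* ~138 MB */
    printf("grandfather : %6.0f MB\n", gf / 1e6);  /* ~50 MB  */
    printf("ratio       : %6.2f\n", oc / gf);      /* 2.75, i.e. roughly 3 */
    return 0;
}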
