by dragonKnight » Thu Jan 05, 2012 4:43 pm
The impact may not only be on practical chess program but on the scientific foundation of the field.
The direction of some sciences is for more quantitative papers than the practical papers suggesting useful search methods.
Therefore in order to make computer chess a mainstream direction in general artificial intelligence and to connect it to the theory of mathematics and theoretical computer science a mathematical foundation must be given.
Fundamentally, the article does not seem to focus on a new heuristic even if the heuristic presented looks to be of practical value and belongs to a class known to be very successful in practice in these days. The aim of the article seems to be to create a scientific foundation for the well known method of partial depth.
The efficiency of some well known methods in computer chess is explained through a theory based on information theory. In this respect it may be a change of paradigm from hand crafted heuristics to mathematically modeled heuristics in computer chess and in other areas of search.
The idea is unique in the history of computer chess. And this does not mean that alpha-beta has been changed completely as a decision mechanism, but the change consists in the exploration method which has an important effect on the decision. It can be added to this method all the heuristics used by normal alpha-beta without exception, but comparing the method with the classic alpha-beta then the difference is relevant and if additional heuristics are used the difference remains.
The exploration expands more often the moves with the highest information value. The concept of information is not used in a sense similar to that in the general vocabulary, but it is precisely that in the theory of information as the equations at the root of the model show.
The method of fraction ply is a generalization of alpha-beta because in classical alpha beta with each ply of search it is added 1. In the fraction depth scheme is added not 1 but a value that can be bigger than 1 in the case of uninteresting moves and smaller than 1 in the case of moves that are probably good. The value added by the fraction ply method can also be 1 as a particular case. The fraction ply method has not been developed in this article but it is one of the best methods used now. Very few has been published about the fraction ply method, as one can see. Actually it is likely only 3 papers have been written about the partial depths scheme, two about this method in chess and one in shoji. 2 papers are very few considering that this is the best or at least one of the most popular methods now. This paper shows also new types of experiments never published by other articles. For instance the effect of information gain on the number of nodes searched.
However the main contribution of this article is that unlike the previous methods used for computer chess, and which have been mostly obtained experimentally and with handcrafted functions and coefficients, this method is based on the axioms and the theory of information theory, in that it calculates a number corresponding to the quality of a move based on the change in relative entropy of the board state. Previous articles on the partial depth scheme calculated the size of the fraction ply based on data mining from games or using intuitively hand crafted formulas that would work.
It is known the fraction depth ply works well, however this shows why. The fraction depth ply works by expanding exactly the lines with a faster decrease in entropy (uncertainty) on the positions. For example , captures and other moves decrease the entropy of the position because they eliminate the source of entropy, the pieces. The relative entropy of a piece is as shown in the article posted above, in the form of log(mobility) and therefore the entropy of the position is a function of the mobility. The uncertainty on the value of the position is dependent on the mobility of pieces. This is something known from practice.
Showing how the most important coefficient, ( the fraction ply added with each level of depth instead of 1 ) results from a mathematical theory (information theory) is an important contribution, because this is actually the most important element of the search process. Therefore the method justifies not only the search procedure but what people have found experimentally over 3 decades or more. Nobody has shown this before !
It is like calculating a constant in physics.
The article defines a metric for the gain of information in computer chess search. Actually it is a pseudo metric, mathematically speaking but still it is important. For decision trees this pseudo-metric is the leibler-kulback divergence and it is considered important. It is defined an analogous concept for computer chess.
A significant difference from the previous fraction ply methods published is that while they put the formula intuitively with no mathematical justification, here it is derived by calculation as one can see, from first principles of information theory with few and very intuitive assumptions.
An important fact is also that while the few other fraction ply methods published use formulas based on intuitive ideas for the 8x8 chess , the popular game, the formula used in this article is based on relative entropies of pieces calculated for the NxN general chess which is a EXPTIME-COMPLETE problem in computer science. In this way, it connects the theory of the 8x8 chess and of computer chess to the NxN computer science problem.
Last edited by
dragonKnight on Thu Jan 05, 2012 4:53 pm, edited 1 time in total.