Information and search in computer chess (Godescu)

BB+ · Post by **BB+** » Mon Dec 12, 2011 6:11 am

Another offering from academia: http://arxiv.org/abs/1112.2149

Abstract
The article describes a model of chess based on information theory. A mathematical model of the partial depth scheme is outlined and a formula for the partial depth added for each ply is calculated from the principles of the model. An implementation of alpha-beta with partial depth is given . The method is tested using an experimental strategy having as objective to show the effect of allocation of a higher amount of search resources on areas of the search tree with higher information. The search proceeds in the direction of lines with higher information gain. The effects on search performance of allocating higher search resources on lines with higher information gain are tested experimentaly and conclusive results are obtained. In order to isolate the effects of the partial depth scheme no other heuristic is used.

hyatt · Post by **hyatt** » Mon Dec 12, 2011 3:57 pm

and not a very helpful one either.

dragonKnight · Post by **dragonKnight** » Thu Jan 05, 2012 4:43 pm

The impact may not only be on practical chess program but on the scientific foundation of the field.

The direction of some sciences is for more quantitative papers than the practical papers suggesting useful search methods.
Therefore in order to make computer chess a mainstream direction in general artificial intelligence and to connect it to the theory of mathematics and theoretical computer science a mathematical foundation must be given.

Fundamentally, the article does not seem to focus on a new heuristic even if the heuristic presented looks to be of practical value and belongs to a class known to be very successful in practice in these days. The aim of the article seems to be to create a scientific foundation for the well known method of partial depth.

The efficiency of some well known methods in computer chess is explained through a theory based on information theory. In this respect it may be a change of paradigm from hand crafted heuristics to mathematically modeled heuristics in computer chess and in other areas of search.

The idea is unique in the history of computer chess. And this does not mean that alpha-beta has been changed completely as a decision mechanism, but the change consists in the exploration method which has an important effect on the decision. It can be added to this method all the heuristics used by normal alpha-beta without exception, but comparing the method with the classic alpha-beta then the difference is relevant and if additional heuristics are used the difference remains.

The exploration expands more often the moves with the highest information value. The concept of information is not used in a sense similar to that in the general vocabulary, but it is precisely that in the theory of information as the equations at the root of the model show.

The method of fraction ply is a generalization of alpha-beta because in classical alpha beta with each ply of search it is added 1. In the fraction depth scheme is added not 1 but a value that can be bigger than 1 in the case of uninteresting moves and smaller than 1 in the case of moves that are probably good. The value added by the fraction ply method can also be 1 as a particular case. The fraction ply method has not been developed in this article but it is one of the best methods used now. Very few has been published about the fraction ply method, as one can see. Actually it is likely only 3 papers have been written about the partial depths scheme, two about this method in chess and one in shoji. 2 papers are very few considering that this is the best or at least one of the most popular methods now. This paper shows also new types of experiments never published by other articles. For instance the effect of information gain on the number of nodes searched.

However the main contribution of this article is that unlike the previous methods used for computer chess, and which have been mostly obtained experimentally and with handcrafted functions and coefficients, this method is based on the axioms and the theory of information theory, in that it calculates a number corresponding to the quality of a move based on the change in relative entropy of the board state. Previous articles on the partial depth scheme calculated the size of the fraction ply based on data mining from games or using intuitively hand crafted formulas that would work.

It is known the fraction depth ply works well, however this shows why. The fraction depth ply works by expanding exactly the lines with a faster decrease in entropy (uncertainty) on the positions. For example , captures and other moves decrease the entropy of the position because they eliminate the source of entropy, the pieces. The relative entropy of a piece is as shown in the article posted above, in the form of log(mobility) and therefore the entropy of the position is a function of the mobility. The uncertainty on the value of the position is dependent on the mobility of pieces. This is something known from practice.

Showing how the most important coefficient, ( the fraction ply added with each level of depth instead of 1 ) results from a mathematical theory (information theory) is an important contribution, because this is actually the most important element of the search process. Therefore the method justifies not only the search procedure but what people have found experimentally over 3 decades or more. Nobody has shown this before !

It is like calculating a constant in physics.

The article defines a metric for the gain of information in computer chess search. Actually it is a pseudo metric, mathematically speaking but still it is important. For decision trees this pseudo-metric is the leibler-kulback divergence and it is considered important. It is defined an analogous concept for computer chess.

A significant difference from the previous fraction ply methods published is that while they put the formula intuitively with no mathematical justification, here it is derived by calculation as one can see, from first principles of information theory with few and very intuitive assumptions.

An important fact is also that while the few other fraction ply methods published use formulas based on intuitive ideas for the 8x8 chess , the popular game, the formula used in this article is based on relative entropies of pieces calculated for the NxN general chess which is a EXPTIME-COMPLETE problem in computer science. In this way, it connects the theory of the 8x8 chess and of computer chess to the NxN computer science problem.

dragonKnight · Post by **dragonKnight** » Thu Jan 05, 2012 4:50 pm

a more general theory is developed in the article :

An Information Theoretic Analysis of Decision in Computer Chess (also written by alexandru godescu)

http://arxiv.org/abs/1112.2144v1

the abstract is:

The basis of the method proposed in this article is the idea that information is one of the most important factors in strategic decisions, including decisions in computer chess and other strategy games. The model proposed in this article and the algorithm described are based on the idea of a information theoretic basis of decision in strategy games . The model generalizes and provides a mathematical justification for one of the most popular search algorithms used in leading computer chess programs, the fractional ply scheme. However, despite its success in leading computer chess applications, until now few has been published about this method. The article creates a fundamental basis for this method in the axioms of information theory, then derives the principles used in programming the search and describes mathematically the form of the coefficients. One of the most important parameters of the fractional ply search is derived from fundamental principles. Until now this coefficient has been usually handcrafted or determined from intuitive elements or data mining. There is a deep, information theoretical justification for such a parameter. In one way the method proposed is a generalization of previous methods. More important, it shows why the fractional depth ply scheme is so powerful. It is because the algorithm navigates along the lines where the highest information gain is possible. A working and original implementation has been written and tested for this algorithm and is provided in the appendix. The article is essentially self-contained and gives proper background knowledge and references. The assumptions are intuitive and in the direction expected and described intuitively by great champions of chess.

hyatt · Post by **hyatt** » Thu Jan 05, 2012 5:13 pm

dragonKnight wrote:The impact may not only be on practical chess program but on the scientific foundation of the field.

The direction of some sciences is for more quantitative papers than the practical papers suggesting useful search methods.
Therefore in order to make computer chess a mainstream direction in general artificial intelligence and to connect it to the theory of mathematics and theoretical computer science a mathematical foundation must be given.

Fundamentally, the article does not seem to focus on a new heuristic even if the heuristic presented looks to be of practical value and belongs to a class known to be very successful in practice in these days. The aim of the article seems to be to create a scientific foundation for the well known method of partial depth.

The efficiency of some well known methods in computer chess is explained through a theory based on information theory. In this respect it may be a change of paradigm from hand crafted heuristics to mathematically modeled heuristics in computer chess and in other areas of search.

The idea is unique in the history of computer chess. And this does not mean that alpha-beta has been changed completely as a decision mechanism, but the change consists in the exploration method which has an important effect on the decision. It can be added to this method all the heuristics used by normal alpha-beta without exception, but comparing the method with the classic alpha-beta then the difference is relevant and if additional heuristics are used the difference remains.

The exploration expands more often the moves with the highest information value. The concept of information is not used in a sense similar to that in the general vocabulary, but it is precisely that in the theory of information as the equations at the root of the model show.

The method of fraction ply is a generalization of alpha-beta because in classical alpha beta with each ply of search it is added 1. In the fraction depth scheme is added not 1 but a value that can be bigger than 1 in the case of uninteresting moves and smaller than 1 in the case of moves that are probably good. The value added by the fraction ply method can also be 1 as a particular case. The fraction ply method has not been developed in this article but it is one of the best methods used now. Very few has been published about the fraction ply method, as one can see. Actually it is likely only 3 papers have been written about the partial depths scheme, two about this method in chess and one in shoji. 2 papers are very few considering that this is the best or at least one of the most popular methods now. This paper shows also new types of experiments never published by other articles. For instance the effect of information gain on the number of nodes searched.

However the main contribution of this article is that unlike the previous methods used for computer chess, and which have been mostly obtained experimentally and with handcrafted functions and coefficients, this method is based on the axioms and the theory of information theory, in that it calculates a number corresponding to the quality of a move based on the change in relative entropy of the board state. Previous articles on the partial depth scheme calculated the size of the fraction ply based on data mining from games or using intuitively hand crafted formulas that would work.

It is known the fraction depth ply works well, however this shows why. The fraction depth ply works by expanding exactly the lines with a faster decrease in entropy (uncertainty) on the positions. For example , captures and other moves decrease the entropy of the position because they eliminate the source of entropy, the pieces. The relative entropy of a piece is as shown in the article posted above, in the form of log(mobility) and therefore the entropy of the position is a function of the mobility. The uncertainty on the value of the position is dependent on the mobility of pieces. This is something known from practice.

Showing how the most important coefficient, ( the fraction ply added with each level of depth instead of 1 ) results from a mathematical theory (information theory) is an important contribution, because this is actually the most important element of the search process. Therefore the method justifies not only the search procedure but what people have found experimentally over 3 decades or more. Nobody has shown this before !

It is like calculating a constant in physics.

The article defines a metric for the gain of information in computer chess search. Actually it is a pseudo metric, mathematically speaking but still it is important. For decision trees this pseudo-metric is the leibler-kulback divergence and it is considered important. It is defined an analogous concept for computer chess.

A significant difference from the previous fraction ply methods published is that while they put the formula intuitively with no mathematical justification, here it is derived by calculation as one can see, from first principles of information theory with few and very intuitive assumptions.

An important fact is also that while the few other fraction ply methods published use formulas based on intuitive ideas for the 8x8 chess , the popular game, the formula used in this article is based on relative entropies of pieces calculated for the NxN general chess which is a EXPTIME-COMPLETE problem in computer science. In this way, it connects the theory of the 8x8 chess and of computer chess to the NxN computer science problem.

Fractional plies have been used for 30+ years. the 1980 Belle chess machine used them. Ken experimented with the idea of the "recapture extension" in reverse. That is, if you were recapturing a piece (a forced move) he reduced depth by 1/2 ply (as opposed to the more normal idea of simply increasing depth by some amount varying up to 1 whole ply, but then always reducing depth by 1 to go to the next level in the search. It is not new at all..

Variable fractional ply extensions and reductions have also been used for a LONG time. That is, the reduction or extension is varied based on some dynamic characteristic observed in the search...

dragonKnight · Post by **dragonKnight** » Thu Jan 05, 2012 8:10 pm

Yes, actually the article mentions that the fraction ply method has been used before and gives some examples of previous articles on the topic , but instead of using
chess theory or heuristic arguments to calculate the size of the ply, it uses the entropy rate of pieces and entropy based measures of positional features, something never done before. The fraction ply is calculated mathematical in the articles of Alexandru Godescu and not handcrafted or build based on chess theory reasoning. Also the formulas are generalizable to the NxN chess, something never dome before. In addition the information gain is defined for the first time in chess and computer chess. The theory of pathology in game trees and its implications on chess are captured by a general equations from algorithmic information theory. this is also something absolutely new. A theory of trajectories of pieces and game state is constructed based on entropy , an additional contribution to the foundation of the field.
Information theoretical arguments explaining the success of using detabases and opening bases and previous games are given. The error in the evaluation
functions compared to a perfect evaluation are described through an information channel , which is also mathematically speaking, a function. It is also shown why the
ordering of moves starting with active moves is the most efficient in practice, based on Solomonoff logic. and these formulas explain also the logic of chess not only that of computer chess, in a way never done before. so there are many different things compared to any previous research. However, the previous contributions are mentioned in the two articles, and also the way the previous formulas can be derived from the theory developed by the two articles is presented. The seccond article shows a new theory of search in chess explaining more than 20 previously known experimental results. Some of the facts were known before and experimentally verified
but never explained starting from few theoretical principles. The value is not in claiming the results but in presenting a way the results are a consequence to a more general theory , based on the axioms of information theory

OpenChess

OpenChess

Information and search in computer chess (Godescu)

Information and search in computer chess (Godescu)

Re: Information and search in computer chess (Godescu)

Re: Information and search in computer chess (Godescu)

Re: Information and search in computer chess (Godescu)

Re: Information and search in computer chess (Godescu)

Re: Information and search in computer chess (Godescu)