In fact, in practice I might guess that the "improvement part" (step 2 of algorithm 3) could be more relevant than the rest. Maybe the speed of feedback allows one to tune multiple parameters more easily than with game testing, so that issues of hill-climbing methodology are not so prevalent?
mballicora wrote: (though they do not describe the improvement part)
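Since the paper leaves that step undescribed, here is only a guess at what such an "improvement part" might look like: a one-parameter-at-a-time hill climb driven by some cheap, fast-to-compute error function. Everything below (the function name, the stand-in error callback, the step rule) is a hypothetical sketch, not Ban's actual procedure:

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// One pass of coordinate-wise hill climbing: nudge each weight up and down by
// `step` and keep the nudge only if the (cheap) error function improves.
// `error` is a stand-in for whatever fast feedback signal the fit provides.
bool improveOnce(std::vector<double>& weights, double step,
                 const std::function<double(const std::vector<double>&)>& error) {
    bool improved = false;
    double best = error(weights);
    for (std::size_t i = 0; i < weights.size(); ++i) {
        for (double delta : {+step, -step}) {
            weights[i] += delta;          // trial change to a single parameter
            double e = error(weights);
            if (e < best) {
                best = e;                 // keep the change
                improved = true;
            } else {
                weights[i] -= delta;      // revert
            }
        }
    }
    return improved;
}
```

Repeating improveOnce() with a shrinking step until no parameter moves would be the usual stopping rule for this kind of local search; the fast feedback is exactly what makes many such passes affordable compared with game testing.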
I too have typically found the silence about "tuning" in general rather odd, with the main chatter being various personae (notably VD, and VR to some extent) claiming that they [or the NSA] have super-secret methods with great superiority to whatever everyone else is using.
I made comments about it later, but with very few exceptions almost nobody was interested in discussing it. This "tuning" topic always involved some sort of secrecy, or nobody took it seriously. Of course, people will look at it differently now.
I had mused that this might be due to the drawishness measurement in Junior, but now you report the same result with no such parameter. MvK says: I'm speculating that this style is because the tuning now favors positions where humans tend to lose their games -- but using comp-comp games should reduce that? It is somewhat of a mystery... It seems that the method tends to give a dynamic style?
I tend to be skeptical and suspect some of this is hyperbole, but I would enjoy being proven incorrect.
Amir Ban wrote: [...] Chess programmers still code expert chess knowledge into their evaluation functions, to the best of their understanding and capability, subject to its observed success in improving the results achieved by their programs (itself a time-consuming and statistically error-prone determination).
The field lacks a theory of evaluation which is able to suggest answers to any of the following questions: What is the meaning of the numerical value of the evaluation? What makes an evaluation right? In what sense can it be wrong? Between two evaluation functions, how to judge which is better?
It is the purpose of this article to suggest a theory of evaluation within which these questions and related ones can be answered. Furthermore I will show that once such a theory is formulated, it may immediately and economically be tested vis-a-vis the entire wealth of recorded chess game repertoire. As a result I will derive a novel automatic learning procedure that works by fitting its evaluation to recorded game results. While new to computer games, this method bears similarity to techniques of maximum-likelihood optimization and logistic regression widely used in medical and social sciences.
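To put the quoted idea in concrete terms, a minimal sketch of "fitting its evaluation to recorded game results" might look like the following: a linear evaluation, a logistic link from evaluation to expected score, and a maximum-likelihood objective over positions labelled with game results. The feature set, the logistic scale, and the handling of draws below are placeholder assumptions, not the paper's actual formulation:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct LabeledPosition {
    std::vector<double> features;  // e.g. material, mobility, king-safety terms
    double result;                 // game result for the side to move: 1, 0.5 or 0
};

// Evaluation as a weighted sum of features (some centipawn-like scale).
double evaluate(const std::vector<double>& w, const LabeledPosition& p) {
    double e = 0.0;
    for (std::size_t i = 0; i < w.size(); ++i) e += w[i] * p.features[i];
    return e;
}

// Logistic link: map an evaluation to an expected score in (0, 1).
double expectedScore(double eval, double scale = 400.0) {
    return 1.0 / (1.0 + std::exp(-eval / scale));
}

// Negative log-likelihood of the observed results under the model; lower is better.
double negLogLikelihood(const std::vector<double>& w,
                        const std::vector<LabeledPosition>& data) {
    double nll = 0.0;
    for (const LabeledPosition& p : data) {
        double q = expectedScore(evaluate(w, p));
        nll -= p.result * std::log(q) + (1.0 - p.result) * std::log(1.0 - q);
    }
    return nll;
}
```

An objective of this kind also gives a direct answer to "which of two evaluation functions is better": the one that assigns higher likelihood to the observed results. The hill-climbing pass sketched earlier in the thread could be pointed at exactly this quantity.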
marcelk wrote: It is indeed interesting to see three independent reports of the same phenomenon ("dynamic Tal-like play") emerging from comparable tuning methods.
Not to quibble, but I can now count 4: you, MB, AB, and also DD:
Don Dailey wrote: Like your experience, we found that the program was aggressive -- it was making all sorts of interesting sacrifices. I cannot say that they were all sound, but we were testing at limited depth, I think 6 ply. At any depth a move is a calculated gamble, and being wrong about a sacrifice, whether winning or losing, can cost you. So if there is to be error, why not err on the side of being too aggressive once in a while? For technical reasons we didn't keep that version, even though it tested quite well.