Relationship between move ordering and pruning

Code, algorithms, languages, construction...
User avatar
Don
Posts: 42
Joined: Thu Dec 30, 2010 12:28 am
Real Name: Don Dailey

Relationship between move ordering and pruning

Post by Don » Mon Dec 17, 2012 3:47 pm

Over the years there has been a constant tendency to increase the amount of pruning and reductions in my program. What is a bit odd about this is that what didn't work before appears to work now. We do very aggressive forward pruning as well as LMR. So why is this?

One of the the things that has improved over time is our move ordering. We have the classic stuff as well as modern improvements and a couple twists of our own, all designed to improve move ordering. Probably the biggest advance in move ordering beyond the classic capture MVVLVA and killer stuff is the history heuristic. That has been around a very long time but not in it's current form which is basically a rapidly "aged" table of move success rates. So I decided to turn that off in an experiment to see how much it "buys" in terms of speed. In the past move ordering was considered a performance issue only - a way to speed up the search but not improve the strength of the program at any particularly depth. But that is no longer the case and probably hasn't been for a long time. Nevertheless I was still surprised when I found that at 10 ply fixed depth searches it was worth 150 ELO. (This test actually finished at 151 ELO plus - but I only have the older data.)

My tentative conclusion here (which is something I think most people already believe) is that heavy pruning, reductions and move ordering are very complimentary. You cannot successfully do heavy pruning without exceptionally good move ordering.

Anecdotal evidence points to this too. Many times I have seen people report that LMR is not worth much in their program. In Komodo it is worth well over 100 ELO at fast time controls. The first time I implemented LMR in Doch I also only was able to obtain a few ELO. And if I was not extremely careful and conservative I found that it had a negative impact on the strength of the program. But today's implementation looks reckless by comparison.

Here is some old data on this run at 10 ply fixed depth. I often use fixed depth to try to isolate performance numbers, never to measure ELO but the difference in ELO here is so huge it is not subject to debate.

Code: Select all

Rank    ELO     +/-    Games    Score  Player
---- ------- ------ -------- --------  ----------------------------
   1  3146.3   20.1      749   69.960  Base
   2  3000.0   20.1      749   30.040  NoHistory

w/l/d: 255 206 288    38.45 percent draws

      TIME       RATIO    log(r)     NODES    log(r)  ave DEPTH    GAMES   PLAYER
 ---------  ----------  --------  --------  --------  ---------  -------   -------
    0.0788       0.657    -0.420     0.058    -0.463     9.9964      749   Base
    0.1200       1.000     0.000     0.091     0.000     9.9975      749   NoHistory

lucasart
Posts: 201
Joined: Mon Dec 17, 2012 1:09 pm
Contact:

Re: Relationship between move ordering and pruning

Post by lucasart » Tue Dec 18, 2012 3:08 pm

Don wrote:Over the years there has been a constant tendency to increase the amount of pruning and reductions in my program. What is a bit odd about this is that what didn't work before appears to work now. We do very aggressive forward pruning as well as LMR. So why is this?

One of the the things that has improved over time is our move ordering. We have the classic stuff as well as modern improvements and a couple twists of our own, all designed to improve move ordering. Probably the biggest advance in move ordering beyond the classic capture MVVLVA and killer stuff is the history heuristic. That has been around a very long time but not in it's current form which is basically a rapidly "aged" table of move success rates.
This is true, but at the same time rather obvious. The key breakthrough that allowed LMR to be performant is the combination of Fabien's idea of using the history value for reduction (what he then called "history pruning") and Tord's idea of how to make a better history mechanism.

Treatement of captures hasn't change before/after the Fruit/Glaurung breakthrough. Still sorted by MVV/LVA (or SEE). But sorting quiet moves by history and using the history value (or move count which amounts to the same) to make better reduction decision is *the* breakthrough.

Now it's all about "how"
- how to make history tables better: Tord's implementation with piece * square and +/- depth^2 increment was quite an improvement compared to doing +/-1 to the history table (like in Fruit and probably most previous programs). I've also found that the ceiling value after which you reduce the array to prevent overflow is quite impactful (history should have a too long or too short memory so to speak).
- tuning the reduction as a function of the move count etc. Glaurung further extended Fruit's history pruning here: better history table (see above) fractional ply reduction with a well tuned smooth increase in reduction as the move count count increases (or as the history value decreases, take your pick).

Quiet moves are extremely important, because:
1/ there are *lots* of them
2/ therefore sorting them well is of utmost importance
3/ and reducing them too. making good decisions on which quiet moves to reduce is extremely important. come to think of it that's what humans do too! Besides, no reducing a capture isn't so bad: one piece less on the board followed by exchanges etc means less on the board hence smaller branching factor. And captures are important (we musn't miss long and complex capture sequences).

Of course making a modification to history mechanism (and sorting in general) should be considered carefully as it is very intertwined with LMR. For example to experiment with a history table mechanism, I think it makes more sense to switch off any kind of LMR and compare the effect of that only. Good history is good moves sortedf first, hence an improvement over a plain and simple alpha/beta. Then a good LMR comes on top of a good history. Not the other way around or the two at the same time.

Same goes for futility pruning (if move count dependant like in SF) or move count pruning. I think one should optimize sorting as much as possible on a plain alpha/beta (with PVS hash table and everything except LMR and move count dependant reductionb/pruning) before implementing/tuning LMR/move count pruning/futility pruning.

User avatar
Don
Posts: 42
Joined: Thu Dec 30, 2010 12:28 am
Real Name: Don Dailey

Re: Relationship between move ordering and pruning

Post by Don » Tue Dec 18, 2012 4:54 pm

lucasart wrote:
Don wrote:Over the years there has been a constant tendency to increase the amount of pruning and reductions in my program. What is a bit odd about this is that what didn't work before appears to work now. We do very aggressive forward pruning as well as LMR. So why is this?

One of the the things that has improved over time is our move ordering. We have the classic stuff as well as modern improvements and a couple twists of our own, all designed to improve move ordering. Probably the biggest advance in move ordering beyond the classic capture MVVLVA and killer stuff is the history heuristic. That has been around a very long time but not in it's current form which is basically a rapidly "aged" table of move success rates.
This is true, but at the same time rather obvious.
It's always been obvious to me, but apparently this is not as obvious as you think because many smart people are not aware of how strong that relationship is. For example even Bob Hyatt didn't much believe in history and has been critical of how LMR works - not believing it should have anything to do with the moves position in the list. I think he has fairly recently changed his viewpoint somewhat but you can see his thinking from posts that are not that old. I am very careful about what I say is "obvious" these days because everything becomes obvious once you are educated to it. It's obvious to me the earth is not flat.

I sometimes make posts like this to education those who need to be educated. This was a golden opportunity to prove the powerful connection.


The key breakthrough that allowed LMR to be performant is the combination of Fabien's idea of using the history value for reduction (what he then called "history pruning") and Tord's idea of how to make a better history mechanism.

Treatement of captures hasn't change before/after the Fruit/Glaurung breakthrough. Still sorted by MVV/LVA (or SEE). But sorting quiet moves by history and using the history value (or move count which amounts to the same) to make better reduction decision is *the* breakthrough.
Yes, I agree that this is a very accurate assessment of the state of the art and why it works so well when done right.

Please note that the traditional move ordering was until recently considered "excellent" or state of the art. It generally follows a standard formula: 1. hash table move, 2. MVV/LVA captures, 3. killers 4. "all the rest." A more recent improvement is SEE to place the losing capture either to the end of the capture list or after the killers. Category 4 moves were sometimes sorted by some variant of Jonathon Shaeffer style of history which was a non-trivial gain but apparently is not worth much if anything these days (in that form.) Some programs used other heuristics such as moves to the center first or piece value table sorted but none of these were anything significant and do not compare to the modern way that you are explaining.

Now it's all about "how"
- how to make history tables better: Tord's implementation with piece * square and +/- depth^2 increment was quite an improvement compared to doing +/-1 to the history table (like in Fruit and probably most previous programs). I've also found that the ceiling value after which you reduce the array to prevent overflow is quite impactful (history should have a too long or too short memory so to speak).
- tuning the reduction as a function of the move count etc. Glaurung further extended Fruit's history pruning here: better history table (see above) fractional ply reduction with a well tuned smooth increase in reduction as the move count count increases (or as the history value decreases, take your pick).

Quiet moves are extremely important, because:
1/ there are *lots* of them
2/ therefore sorting them well is of utmost importance
3/ and reducing them too. making good decisions on which quiet moves to reduce is extremely important. come to think of it that's what humans do too! Besides, no reducing a capture isn't so bad: one piece less on the board followed by exchanges etc means less on the board hence smaller branching factor. And captures are important (we musn't miss long and complex capture sequences).

Of course making a modification to history mechanism (and sorting in general) should be considered carefully as it is very intertwined with LMR. For example to experiment with a history table mechanism, I think it makes more sense to switch off any kind of LMR and compare the effect of that only. Good history is good moves sortedf first, hence an improvement over a plain and simple alpha/beta. Then a good LMR comes on top of a good history. Not the other way around or the two at the same time.
Recently I have done some experimentation with improving the way I score history in Komodo. I did not isolate LMR from this experiment as you suggest but without exception I found that anything that improved the strength of the program correlated almost perfectly with increase depth as well as ELO. For example if I tested at fixed depth levels the version that was faster also had a higher ELO. There was no trade-off in this case where you gain some speed at the sacrifice of ELO and have to pick the best trade-off, as most things are. I agree that this doesn't prove the move ordering was actually "better", it is possible that it was only better when combined with LMR - but that's really all that I care about anyway. I would not want to tune it without LMR only to find that I have to re-tuned it for LMR. But I do sometimes do experiments to isolate one thing from another as it can lead to new insights.

It's not just LMR that is at stake here either. We do some pretty severe forward pruning too which is move order sensitive.

Same goes for futility pruning (if move count dependant like in SF) or move count pruning. I think one should optimize sorting as much as possible on a plain alpha/beta (with PVS hash table and everything except LMR and move count dependant reductionb/pruning) before implementing/tuning LMR/move count pruning/futility pruning.
I don't consider that true futility pruning, at least not in the classical sense. Classical futility takes you into quies a ply early - and an extension of the idea is to perhaps go into quies a couple of ply early. Komodo does not go into quies early as such, it just treats a futility node as if it were the first ply of quies (captures and checks) but then returns to the main search. We still call it futility pruning but that is semantics. Stockfish on the other hand has a more sophisticated mechanism which is very futility-like but changes the margin based on the move number and can include quiet moves and also examines it's relationship to the so-called "threat move" which is a by-product of the null move search.

We actually have similar mechanisms too but we don't have any consideration for the threat move - we do not' see it as futility but it's forward pruning based on the move count with some tactical considerations too.

Some of this is just terminology and semantics. The deeper principles are the same and any good program makes use of them. Looking at the chessprogramming wiki I see that definitions of various techniques tend to "evolve" or change over the years. A prime example is the definition of "highly selective" compared to "brute force" which has undergone a complete transformation. What is now considered "brute force" would have been considered ridiculously selective 30 years ago. I blame this on a lack of historical perspective.

User avatar
Rebel
Posts: 515
Joined: Wed Jun 09, 2010 7:45 pm
Real Name: Ed Schroder

Re: Relationship between move ordering and pruning

Post by Rebel » Tue Dec 18, 2012 9:28 pm

Good stuff.... Once people get the logic behind LMR the relationship with move-ordering is automatic. I recently (for the umpteenth time) tried all kind of forms of history mechanisms to improve move-ordering for an even better LMR performance and once again did not succeed. In fact I never was able to find an improvement in all those years to the current system as described on my page.

My last effort to improve move-ordering was the (what I called) TRS-80 approach, the time of 16,000 bytes and thus no TT. Going one ply deeper I generated all moves, evaluated them (plus static adjustment for hanging pieces), sorted them on score and then searched them one by one by the highest score. This all became in vain with the introduction of the TT which was superior at that time to the TRS-80 system and still is as I recently found out. But my eval is slow, perhaps there is value in the idea for others, because the differences in speed-up / slow-down are very small.

One more thing, to measure move-ordering performance I use the Hsu formula: in the main search (not QS) I count the number of Alpha-Beta tries. If succeeded && it is the first move a second counter is incremented so one can calculate a percentage. If the move ordering is good it usually gets 92-94% in the middle game and a 96+% in the end game sometimes even peeking to 99%.

Perhaps it is an idea to pick a couple of hundred random positions, measure the average move-ordering percentage and compare programs.

User avatar
Don
Posts: 42
Joined: Thu Dec 30, 2010 12:28 am
Real Name: Don Dailey

Re: Relationship between move ordering and pruning

Post by Don » Tue Dec 18, 2012 10:01 pm

Rebel wrote:Good stuff.... Once people get the logic behind LMR the relationship with move-ordering is automatic. I recently (for the umpteenth time) tried all kind of forms of history mechanisms to improve move-ordering for an even better LMR performance and once again did not succeed. In fact I never was able to find an improvement in all those years to the current system as described on my page.

My last effort to improve move-ordering was the (what I called) TRS-80 approach, the time of 16,000 bytes and thus no TT. Going one ply deeper I generated all moves, evaluated them (plus static adjustment for hanging pieces), sorted them on score and then searched them one by one by the highest score. This all became in vain with the introduction of the TT which was superior at that time to the TRS-80 system and still is as I recently found out. But my eval is slow, perhaps there is value in the idea for others, because the differences in speed-up / slow-down are very small.

One more thing, to measure move-ordering performance I use the Hsu formula: in the main search (not QS) I count the number of Alpha-Beta tries. If succeeded && it is the first move a second counter is incremented so one can calculate a percentage. If the move ordering is good it usually gets 92-94% in the middle game and a 96+% in the end game sometimes even peeking to 99%.

Perhaps it is an idea to pick a couple of hundred random positions, measure the average move-ordering percentage and compare programs.
It's been a while since I last read your awesome page that reveals many of Rebels' secrets. I don't remember how you move ordering history scheme works but I will take another look.

My history scores are maintained incrementally. A move is either best or just another move and instead of periodically reducing the table to prevent overflow I simply use a formula that weights recent entries more. for example newval = oldval * 0.99 + 0.01 for an update and newval = oldval * 0.99 if it's not best. That's not the exact formula I use but it has the same basic form. It will gradually "forget" the affect of older updates and I don't need two entries. And of course it cannot overflow as it asymptotically approaches 1 in my example. My actual formula is designed to asymptotically approach some large value that fits in 16 bits and it's integer based and how quickly it forgets has been carefully tuned.

User avatar
Don
Posts: 42
Joined: Thu Dec 30, 2010 12:28 am
Real Name: Don Dailey

Re: Relationship between move ordering and pruning

Post by Don » Tue Dec 18, 2012 10:30 pm

Just out of curiosity, what do you feel is the relationship between evaluation quality and LMR/pruning?

Don

Don wrote:
Rebel wrote:Good stuff.... Once people get the logic behind LMR the relationship with move-ordering is automatic. I recently (for the umpteenth time) tried all kind of forms of history mechanisms to improve move-ordering for an even better LMR performance and once again did not succeed. In fact I never was able to find an improvement in all those years to the current system as described on my page.

My last effort to improve move-ordering was the (what I called) TRS-80 approach, the time of 16,000 bytes and thus no TT. Going one ply deeper I generated all moves, evaluated them (plus static adjustment for hanging pieces), sorted them on score and then searched them one by one by the highest score. This all became in vain with the introduction of the TT which was superior at that time to the TRS-80 system and still is as I recently found out. But my eval is slow, perhaps there is value in the idea for others, because the differences in speed-up / slow-down are very small.

One more thing, to measure move-ordering performance I use the Hsu formula: in the main search (not QS) I count the number of Alpha-Beta tries. If succeeded && it is the first move a second counter is incremented so one can calculate a percentage. If the move ordering is good it usually gets 92-94% in the middle game and a 96+% in the end game sometimes even peeking to 99%.

Perhaps it is an idea to pick a couple of hundred random positions, measure the average move-ordering percentage and compare programs.
It's been a while since I last read your awesome page that reveals many of Rebels' secrets. I don't remember how you move ordering history scheme works but I will take another look.

My history scores are maintained incrementally. A move is either best or just another move and instead of periodically reducing the table to prevent overflow I simply use a formula that weights recent entries more. for example newval = oldval * 0.99 + 0.01 for an update and newval = oldval * 0.99 if it's not best. That's not the exact formula I use but it has the same basic form. It will gradually "forget" the affect of older updates and I don't need two entries. And of course it cannot overflow as it asymptotically approaches 1 in my example. My actual formula is designed to asymptotically approach some large value that fits in 16 bits and it's integer based and how quickly it forgets has been carefully tuned.

User avatar
Don
Posts: 42
Joined: Thu Dec 30, 2010 12:28 am
Real Name: Don Dailey

Re: Relationship between move ordering and pruning

Post by Don » Tue Dec 18, 2012 11:05 pm

Rebel wrote:Good stuff.... Once people get the logic behind LMR the relationship with move-ordering is automatic. I recently (for the umpteenth time) tried all kind of forms of history mechanisms to improve move-ordering for an even better LMR performance and once again did not succeed. In fact I never was able to find an improvement in all those years to the current system as described on my page.
I understand from your page that you simply add 5 if the move is good (fail high) and subtract 1 otherwise. Are you saying that you have not been able to improve on that?

And you have a separate heuristic for move ordering - you do not use this table except for reductions? If I understand it I could experiment to see if any of these differences makes any difference with Komodo. I use this table for move ordering which of course affects move number based LMR. Larry and I have this long todo list of things to try and one thing on that list is to experiment with folding in the history score to the LMR decision.

My last effort to improve move-ordering was the (what I called) TRS-80 approach, the time of 16,000 bytes and thus no TT. Going one ply deeper I generated all moves, evaluated them (plus static adjustment for hanging pieces), sorted them on score and then searched them one by one by the highest score. This all became in vain with the introduction of the TT which was superior at that time to the TRS-80 system and still is as I recently found out. But my eval is slow, perhaps there is value in the idea for others, because the differences in speed-up / slow-down are very small.

One more thing, to measure move-ordering performance I use the Hsu formula: in the main search (not QS) I count the number of Alpha-Beta tries. If succeeded && it is the first move a second counter is incremented so one can calculate a percentage. If the move ordering is good it usually gets 92-94% in the middle game and a 96+% in the end game sometimes even peeking to 99%.

Perhaps it is an idea to pick a couple of hundred random positions, measure the average move-ordering percentage and compare programs.

lucasart
Posts: 201
Joined: Mon Dec 17, 2012 1:09 pm
Contact:

Re: Relationship between move ordering and pruning

Post by lucasart » Wed Dec 19, 2012 12:15 am

Don wrote:Just out of curiosity, what do you feel is the relationship between evaluation quality and LMR/pruning?

Don
That is a tricky one. I really don't know for sure. But I'll still try a guess: perhaps the search behaves better with a more continuous eval ?
"Talk is cheap. Show me the code." -- Linus Torvalds.

User avatar
Don
Posts: 42
Joined: Thu Dec 30, 2010 12:28 am
Real Name: Don Dailey

Re: Relationship between move ordering and pruning

Post by Don » Wed Dec 19, 2012 12:32 am

lucasart wrote:
Don wrote:Just out of curiosity, what do you feel is the relationship between evaluation quality and LMR/pruning?

Don
That is a tricky one. I really don't know for sure. But I'll still try a guess: perhaps the search behaves better with a more continuous eval ?
I don't have any sort of proof either, just a hunch. But I believe that a high quality evaluation function has a positive impact on selectivity. My idea here is that the better the evaluation function, the better the moves will be ordered in the absolute chess sense. That should not matter if the program doesn't know the difference, but the search sometimes DOES know the difference. So if you play a bad move because the evaluation doesn't understand, it will eventually show up as a problem that the search has to solve.

A silly example may help. Suppose your program doesn't understand that some move leaves a weak pawn? The evaluation doesn't know about that but the search is forced to defend that weak pawn. I have made the point before that search can enhance the positional play of a program significantly by means of this basic mechanism. Pressure on what it knows can force it to sense what it does not know. So I believe that good evaluation has a compounding effect like money in the bank that compound daily. The deeper you search the more important good evaluation becomes.

Imagine a program that can look 100 ply deep but has only material scoring for an evaluation function. It would probably appear to be playing incredibly strong positional chess. It would simulate to a large extent the missing evaluation features. But because it has poor understanding in general the move order would be horrendous.

User avatar
Rebel
Posts: 515
Joined: Wed Jun 09, 2010 7:45 pm
Real Name: Ed Schroder

Re: Relationship between move ordering and pruning

Post by Rebel » Wed Dec 19, 2012 12:49 am

Don wrote:
Rebel wrote:Good stuff.... Once people get the logic behind LMR the relationship with move-ordering is automatic. I recently (for the umpteenth time) tried all kind of forms of history mechanisms to improve move-ordering for an even better LMR performance and once again did not succeed. In fact I never was able to find an improvement in all those years to the current system as described on my page.
I understand from your page that you simply add 5 if the move is good (fail high) and subtract 1 otherwise. Are you saying that you have not been able to improve on that?

And you have a separate heuristic for move ordering - you do not use this table except for reductions? If I understand it I could experiment to see if any of these differences makes any difference with Komodo. I use this table for move ordering which of course affects move number based LMR. Larry and I have this long todo list of things to try and one thing on that list is to experiment with folding in the history score to the LMR decision.
I think you are looking at the wrong page :D That was my initial trial to get history reductions to work and gave 20-30 elo. The history table itself was not used for move-ordering and does not exist any longer. The page is out-dated since I found out that LMR (without history tables) simply gives more. For move-ordering the right place to look for is Move Ordering (1).

Post Reply