Designing an analysis friendly Stockfish?

Code, algorithms, languages, construction...
ernest
Posts: 247
Joined: Thu Sep 02, 2010 10:33 am

Re: Designing an analysis friendly Stockfish?

Post by ernest » Mon Feb 14, 2011 2:09 am

mcostalba wrote:After 5811 games Mod- Orig: 1037 - 902 - 3872 +8 ELO (+- 3.6) LOS 97%
What is (+- 3.6)?
I find the 95% probability to be (+- 5.3)... :)

BB+
Posts: 1484
Joined: Thu Jun 10, 2010 4:26 am

Re: Designing an analysis friendly Stockfish?

Post by BB+ » Mon Feb 14, 2011 2:27 am

After 5811 games Mod- Orig: 1037 - 902 - 3872 +8 ELO (+- 3.6) LOS 97%
What is (+- 3.6)? I find the 95% probability to be (+- 5.3)... :)
66% draw ratio is fairly high, which could reduce the relative error bars compared to a formula based purely on #games played. In any case, a binomial comparison with 1037:902 is 99.9% LOS, no? EDIT: What I mean is, the chance of something as lopsided as 1037:902 between two equal programs is only 1 in 998.
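BB+'s back-of-the-envelope figure can be checked with a small sketch: treat the 1037 wins vs. 902 losses as a binomial with p = 0.5 (draws ignored) and ask how likely a program with no real advantage is to produce a split this lopsided. This uses the normal approximation to the binomial, not the BayesElo model, so the exact number may differ slightly from the "1 in 998" above.

```python
# Likelihood of superiority from wins/losses only, via the normal
# approximation to the binomial (draws ignored). Illustrative sketch.
from math import erf, sqrt

def los_normal(wins: int, losses: int) -> float:
    n = wins + losses
    z = (wins - losses) / sqrt(n)  # = (wins - n/2) / sqrt(n/4)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

los = los_normal(1037, 902)
print(f"LOS = {los:.4%}")  # comes out around 99.9%
```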

mcostalba
Posts: 91
Joined: Thu Jun 10, 2010 11:45 pm
Real Name: Marco Costalba

Re: Designing an analysis friendly Stockfish?

Post by mcostalba » Mon Feb 14, 2011 7:37 am

BB+ wrote:
After 5811 games Mod- Orig: 1037 - 902 - 3872 +8 ELO (+- 3.6) LOS 97%
What is (+- 3.6)? I find the 95% probability to be (+- 5.3)... :)
66% draw ratio is fairly high, which could reduce the relative error bars compared to a formula based purely on #games played. In any case, a binomial comparison with 1037:902 is 99.9% LOS, no? EDIT: What I mean is, the chance of something as lopsided as 1037:902 between two equal programs is only 1 in 998.
The LOS of 97% comes from bayeselo.

The error bar comes from: 40 / sqrt($wins + $losses + $draws) * 7
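As a direct transcription of Marco's one-liner, for the 5811-game match this gives about 3.7, close to the reported ±3.6:

```python
# Marco's rule of thumb for the reported error bar, transcribed as-is.
from math import sqrt

def error_bar(wins: int, losses: int, draws: int) -> float:
    return 40.0 / sqrt(wins + losses + draws) * 7.0

print(round(error_bar(1037, 902, 3872), 1))
```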

User avatar
Uly
Posts: 838
Joined: Thu Jun 10, 2010 5:33 am

Re: Designing an analysis friendly Stockfish?

Post by Uly » Mon Feb 14, 2011 11:24 am

fruity wrote:SF 2.1 or whatever it will be called would be about 8 Elo weaker without your genius thread! So let me thank you.
:D , it's me who is thankful!
Peter C wrote:As far as I can tell, smooth scaling makes it weaker (sometimes significantly), but has a habit of suggesting interesting analysis moves. I put it in there mostly just because I could and it can be handy for analysis, but it for sure has a negative Elo value (which is why it's off by default). The parameters for it could be tuned a bit, maybe we can get something useful from it.
Oh, indeed, Smooth scaling improves playing style. At some point I was using the s version alongside the default version with tolerable levels of redundancy, and I think I can do so again once we get stable builds (nobody has suggested Stockfish Learning code yet).

Jeremy Bernstein
Site Admin
Posts: 1226
Joined: Wed Jun 09, 2010 7:49 am
Real Name: Jeremy Bernstein
Location: Berlin, Germany
Contact:

Re: Designing an analysis friendly Stockfish?

Post by Jeremy Bernstein » Mon Feb 14, 2011 12:48 pm

Uly, would you be interested in a version with lower granularity, as well? My benchmarks don't indicate that it's any slower, although it may well be weaker -- haven't done any play testing.

By "learning" do you mean persistent hash? I haven't given this any more thought, but again, it's not a terribly difficult problem. The principal issue is determining when to push a value to file. The standard strategy appears to be logging positions/moves which fail significantly after a certain amount of searching (where the delta value from one iteration to another exceeds a certain threshold). It seems like it would be useful to do the same for positions/moves which succeed significantly from one iteration to the next, but maybe that's less vital. I probably need to do some more reading on what's been tried so far before I attempt to propose something for SF.
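The trigger Jeremy describes can be sketched as follows: compare a position's score across consecutive iterations and record it when the swing exceeds a threshold. All names here (`Record`, `should_persist`, the cutoff values) are hypothetical, not Stockfish code:

```python
# Sketch of a "push to file" trigger: persist a position when its score
# swings sharply between iterations, after enough search has been done.
from dataclasses import dataclass

THRESHOLD = 50  # centipawns; an assumed cutoff
MIN_DEPTH = 12  # only persist after "a certain amount of searching"

@dataclass
class Record:
    fen: str
    best_move: str
    depth: int
    score: int

def should_persist(prev_score: int, new_score: int, depth: int) -> bool:
    # Fires on big fail-lows *and* big fail-highs between iterations.
    return depth >= MIN_DEPTH and abs(new_score - prev_score) >= THRESHOLD

records = []
scores = {14: 32, 15: 30, 16: -40}  # toy per-iteration scores for one position
prev = None
for depth, score in sorted(scores.items()):
    if prev is not None and should_persist(prev, score, depth):
        records.append(Record("startpos", "e2e4", depth, score))
    prev = score

print(records)  # only the depth-16 fail-low gets recorded
```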

User avatar
Uly
Posts: 838
Joined: Thu Jun 10, 2010 5:33 am

Re: Designing an analysis friendly Stockfish?

Post by Uly » Mon Feb 14, 2011 1:36 pm

Jeremy Bernstein wrote:Uly, would you be interested in a version with lower granularity, as well?
Yes! One problem with Stockfish is that eventually, after interactive analysis, several moves are tied in score, and one doesn't know where to continue. Going by number of positions isn't very efficient, and here one needs a different engine that helps Stockfish by guiding it toward the move that should be analyzed next (the one with the highest score).

Without granularity, the problem would be 1/4 less likely (or basically disappear, since with other engines moves that tie after interaction are very rare, unless it's a transposition).

As for learning, here's the suggested behavior:

The user sets a "Write depth", this is so that only useful content gets stored, and this may be a user decision depending on hardware or the position. I usually let Stockfish analyze for 1:20 and use the reached depth as write depth for my manual analysis file (that the engine can't see or use).

For example, let's assume the user sets Write Depth to 16, and Stockfish analyzes this:
.15/14	 0:00 	+0.16--	1.Nf3 Nf6 2.e3 d5 3.d4 e6 4.Bd3 c5 5.O-O c4 6.Be2 Nc6 7.Nc3 Bd6 (509.807) 615
 15/09	 0:01 	+0.32++	1.e4 Nf6 2.e5 Nd5 3.Nf3 Nc6 4.d4 e6 5.Bd3 (928.123) 674
 15/09	 0:01 	+0.40++	1.e4 Nf6 2.e5 Nd5 3.Nf3 Nc6 4.d4 e6 5.Bd3 (1.152.385) 695
 15/09	 0:02 	+0.56++	1.e4 Nf6 2.e5 Nd5 3.Nf3 Nc6 4.d4 e6 5.Bd3 (1.424.940) 712
 15/16	 0:03 	+0.44 	1.e4 e5 2.Nf3 Bd6 3.Nc3 Nf6 4.Be2 Nc6 5.O-O O-O 6.d3 b6 7.d4 exd4 8.Nxd4 Bb7 (2.247.548) 730
 16/18	 0:04 	+0.24--	1.e4 e5 2.Nf3 Nf6 3.Nxe5 d6 4.Nf3 Nxe4 5.Bd3 Nf6 6.O-O Be7 7.Nc3 O-O 8.Re1 Nc6 9.Ng5 d5 (3.151.485) 733
 16/26	 0:05 	+0.32 	1.e4 e5 2.Nf3 Nf6 3.Nxe5 d6 4.Nf3 Nxe4 5.Qe2 Qe7 6.Nc3 Nxc3 7.dxc3 Qxe2+ 8.Bxe2 Nc6 9.Be3 Be7 10.O-O-O O-O 11.Kb1 Be6 12.Ng5 a6 13.Nxe6 fxe6 (4.052.349) 738
Now, "e2e4 d16 s32" (or something) is written to the file for the opening position: it is known that at depth 16, e2e4 has a score of 32. Even if the hash is cleared or Stockfish is reloaded, e2e4 won't need to be re-searched up to depth 16, and would be shown with a score of 0.32 up to that point. Normal search would resume from depth 17, which would look like this:
.17/26	 0:05 	+0.32 	1.e4 e5 2.Nf3 Nf6 3.Nxe5 d6 4.Nf3 Nxe4 5.Qe2 Qe7 6.Nc3 Nxc3 7.dxc3 Qxe2+ 8.Bxe2 Nc6 9.Be3 Be7 10.O-O-O O-O 11.Kb1 Be6 12.Ng5 a6 13.Nxe6 fxe6 (4.252.062) 751
At this point in analysis, "e7e5 d16 s32" is written for the position after 1.e4, since the relative depth is 16. For the opening position, "e2e4 d17 s32" is overwritten.
.18/26	 0:06 	+0.32 	1.e4 e5 2.Nf3 Nf6 3.Nxe5 d6 4.Nf3 Nxe4 5.Qe2 Qe7 6.Nc3 Nxc3 7.dxc3 Qxe2+ 8.Bxe2 Nc6 9.Be3 Be7 10.O-O-O O-O 11.Kb1 Be6 12.Ng5 a6 13.Nxe6 fxe6 (4.601.681) 751
"g1f3 d16 s32" is written for the position after 1.e4 e5. For the position after 1.e4, "e7e5 d17 s32" is overwritten. For the opening position, "e2e4 d18 s32" is overwritten.

.19/25	 0:07 	+0.40++	1.e4 e5 2.Nf3 Nf6 3.Nxe5 d6 4.Nf3 Nxe4 5.Qe2 Qe7 6.Nc3 Nxc3 7.dxc3 Qxe2+ 8.Bxe2 Nc6 9.Be3 Be7 10.O-O-O O-O 11.Kb1 Be6 12.Ng5 a6 13.h4 (5.619.232) 755
 19/19	 0:08 	+0.48++	1.e4 e5 2.Nf3 Nf6 3.d4 exd4 4.e5 Qe7 5.Be2 Ng4 6.Qxd4 d6 7.exd6 cxd6 8.Nc3 Nc6 9.Qa4 Qe6 10.O-O (6.811.131) 759
 19/26	 0:11 	+0.32 	1.e4 e5 2.Nf3 Nf6 3.d4 exd4 4.e5 Qe7 5.Be2 Ng4 6.Qxd4 h5 7.Nc3 Nc6 8.Qf4 Ncxe5 9.O-O c6 10.Re1 d6 11.h3 Nxf3+ 12.Qxf3 Ne5 13.Qf4 Be6 (8.935.601) 757
And here "g8f6 d16 s32" is written for the position after 1.e4 e5 2.Nf3, "g1f3 d17 s32" for the position after 1.e4 e5, "e7e5 d18 s32" for the position after 1.e4, and "e2e4 d19 s32" is overwritten for the opening position.

And so on.

So, after reloading Stockfish, at the beginning of the search it is aware of the learned contents and does not re-search the moves in the file.

These would be path-independent, so that from the start Stockfish is aware of the contents for 1.Nf3 e5 2.e4 Nf6, 1.Nf3 Nf6 2.e4 e5, and 1.e4 Nf6 2.Nf3 e5 ("g8f6 d16 s32") and all their transpositions; when they come up in search, they'd return s32 and be pruned.

Having these learned contents does not mean the move is best; a user can do a long analysis of 1.f3, and Stockfish would acknowledge the learning and not re-search it, but it won't show it as best (this may be an unnecessary mention, but it's a fatal flaw of Rybka 3's Persistent Hash, so it's here for good measure).

And finally, the learning file should be user-friendly and allow the user to edit it. Many times I'm sure the position is a draw because it's a fortress, but the engine doesn't see it and there are too many variations, making it prohibitive even for learning. It would be really cool to edit the relevant text line, add "f8e7 d59 s0", and have Stockfish know that the position is drawn, and that all positions leading to it are drawn as well.

(That's my proposal for Learning, I would like to avoid the name "Persistent Hash" because it's confusing, this is not a hash)
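The proposal above can be sketched with a few lines of code: one entry per position (keyed here by FEN, so transpositions share it), each holding a move, a depth, and a score, with deeper results overwriting shallower ones as in the walkthrough. The file name, keying scheme, and class names are my assumptions for illustration, not an agreed spec:

```python
# Sketch of the plain-text learning store: "e2e4 d16 s32"-style entries,
# keyed by position so transpositions share them; deeper results overwrite.
def parse_entry(line: str):
    """Parse a line like "e2e4 d16 s32" into (move, depth, score)."""
    move, d, s = line.split()
    return move, int(d[1:]), int(s[1:])

def format_entry(move: str, depth: int, score: int) -> str:
    return f"{move} d{depth} s{score}"

class LearnFile:
    """In-memory stand-in for the text file; keyed by position, not by path."""
    def __init__(self):
        self.table = {}  # fen -> (move, depth, score)

    def store(self, fen: str, move: str, depth: int, score: int):
        old = self.table.get(fen)
        if old is None or depth >= old[1]:  # deeper results overwrite
            self.table[fen] = (move, depth, score)

    def probe(self, fen: str):
        return self.table.get(fen)

book = LearnFile()
start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -"
book.store(start, *parse_entry("e2e4 d16 s32"))
book.store(start, "e2e4", 19, 32)   # deeper result overwrites, as in the walkthrough
book.store(start, "e2e4", 10, 50)   # shallower result is ignored
print(book.probe(start))
```

Because the format is plain text, the fortress case works by hand-editing: append `format_entry("f8e7", 59, 0)` to the relevant position's line.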

Jeremy Bernstein
Site Admin
Posts: 1226
Joined: Wed Jun 09, 2010 7:49 am
Real Name: Jeremy Bernstein
Location: Berlin, Germany
Contact:

Re: Designing an analysis friendly Stockfish?

Post by Jeremy Bernstein » Mon Feb 14, 2011 2:05 pm

Uly wrote:
Jeremy Bernstein wrote:Uly, would you be interested in a version with lower granularity, as well?
Yes! One problem with Stockfish is that eventually, after interactive analysis, several moves are tied in score, and one doesn't know where to continue. Going by number of positions isn't very efficient, and here one needs a different engine that helps Stockfish by guiding it toward the move that should be analyzed next (the one with the highest score).

Without granularity, the problem would be 1/4 less likely (or basically disappear, since with other engines moves that tie after interaction are very rare, unless it's a transposition).
Try this build (64-bit). Should have a granularity of 2 now, but I can't say if it will perform any better. Also, if you notice a reversion to a granularity of 8 in endgame positions, please let me know.

Jeremy
Attachments
Stockfish_201_PA_GTB_Gran2_x64.7z
(307.24 KiB) Downloaded 265 times

User avatar
Uly
Posts: 838
Joined: Thu Jun 10, 2010 5:33 am

Re: Designing an analysis friendly Stockfish?

Post by Uly » Mon Feb 14, 2011 2:13 pm

Thanks! Out of curiosity, what caused the granularity? It's expected that a more precise evaluation of positions would bring better results, so blurring the lines and making two moves that could really be 7 centipawns apart have the same score seems like a weird design choice.

Jeremy Bernstein
Site Admin
Posts: 1226
Joined: Wed Jun 09, 2010 7:49 am
Real Name: Jeremy Bernstein
Location: Berlin, Germany
Contact:

Re: Designing an analysis friendly Stockfish?

Post by Jeremy Bernstein » Mon Feb 14, 2011 2:17 pm

Uly wrote:Thanks! Out of curiosity, what caused the granularity? It's expected that the more precise evaluation of positions would bring better results, so blurring the lines and making two moves that could really be 7 centipawns apart have the same score seems like a weird design choice.
It's simply part of the code (there is a variable called GrainSize which is used in a few places, and sometimes it's hard-coded). Presumably, Marco and the boys found that a granularity of 8 performed better -- maybe he could comment. Anyway, give it a run. I'm curious if the increased subtlety brings any improvement.
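The effect of the grain can be illustrated with a small sketch: scores are snapped to multiples of the grain size, so two moves that are really 7 centipawn-units apart can collapse to the same value. This is an illustration of the idea, not the actual Stockfish code:

```python
# How a grain of 8 blurs nearby evaluations: values are truncated to
# multiples of the grain, as integer division toward zero does in C++.
def apply_grain(value: int, grain: int = 8) -> int:
    return int(value / grain) * grain

print(apply_grain(23), apply_grain(17))        # both collapse to 16 with grain 8
print(apply_grain(23, 2), apply_grain(17, 2))  # 22 and 16 stay distinct with grain 2
```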

Jeremy

User avatar
Uly
Posts: 838
Joined: Thu Jun 10, 2010 5:33 am

Re: Designing an analysis friendly Stockfish?

Post by Uly » Mon Feb 14, 2011 3:03 pm

Well, yes, the improvement is self-evident, here's an example of a common scenario.

One is analyzing with a margin of 1 centipawn, which means new mainlines are considered when they beat the old mainline by 0.01, while mainlines with the same score have to be tie-broken (this is trivial: one goes to the tail and forces moves until the score becomes +0.01 or -0.01).

Now, assume some deep move of 1.e4 e5 2.Nf3 Nc6 3.Bb5 is scored as 0.16 and falls over the margin, which means one would like to examine early white alternatives that are >0.16. But there's 1.e4 e5 2.Nf3 Nc6 3.Bc4 with a score of 0.16, so one has to examine it deeper until it goes higher (becomes new mainline) or lower (gets refuted so one searches for another alternative, or goes earlier in the mainline to search for better moves, or goes back to extending Bb5).

With granularity=8, one has to examine 3.Bc4 until it goes to either 0.24 or 0.08, which takes a while longer than with granularity=1, where it's very likely that Bb5 and Bc4 don't tie at all, so Bc4 becomes the new mainline automatically, or the margin can be set higher.

This is just an example of an analysis method, but low granularity is conceptually better; for complex positions that don't transpose, where one doesn't want to have to refute futile variations, one would even wish to have millipawn granularity available, or something.

Your implementation is working great: since it was granularity=2, I had expected to see only odd or even evaluations, but I'm seeing jumps of one centipawn (such as from -0.96 to -0.95) without problems.

Post Reply