ianfab / chess-variant-puzzler

Puzzle generator for chess variants

Remove puzzles that share the same solution

Belzedar94 opened this issue · comments

This would avoid slight variations that lead to the exact same solution, as in chapters 2-3 of https://lichess.org/study/vxrJlFCV

For example, I found this duplicate puzzle in the autocorr created chennis collection (the only difference is the so-called supplied move, "sm"):

2k4/1p+f4/3+s3/7/3m3/3+F1P1/4KM1[s] w - - 0 7;variant chennis;sm d2d5-;bm +S@d2;eval #2;difficulty 0.000;content 2.338;quality 0.197;volatility 0.000;volatility2 0.017;accuracy 0.000;accuracy2 0.071;std 0.000;ambiguity 0.000;type mate;pv d2d5-,+S@d2,e1d1,c6c2-
2k4/1p+f4/3+s3/7/3m3/5P1/3+FKM1[s] w - - 0 7;variant chennis;sm d1d5-;bm +S@d2;eval #2;difficulty 0.061;content 2.279;quality 0.194;volatility 0.012;volatility2 0.018;accuracy 0.012;accuracy2 0.060;std 0.000;ambiguity 0.000;type mate;pv d1d5-,+S@d2,e1d1,c6c2-

I would like to distinguish a few criteria here:

  1. identical starting position
  2. identical solution line
  3. identical final position

More sophisticated duplicates are unrealistic to cover, I think, and they also occur within the lichess puzzles, e.g., the exact same mating pattern on different squares or with different material configurations.
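As a rough illustration of the first two criteria, here is a minimal sketch that parses puzzle records in the semicolon-separated format shown above and groups them by starting FEN and by solution line. The helper names (`parse_record`, `solution_line`, `group_by`) are made up for this example and are not part of the generator. The two records are the ones from the report, abbreviated to the fields that matter here. The third criterion (identical final position) would additionally need a move generator and is left out.

```python
# Abbreviated copies of the two records from the report (only the fields used here).
RECORDS = [
    "2k4/1p+f4/3+s3/7/3m3/3+F1P1/4KM1[s] w - - 0 7;variant chennis;sm d2d5-;bm +S@d2;pv d2d5-,+S@d2,e1d1,c6c2-",
    "2k4/1p+f4/3+s3/7/3m3/5P1/3+FKM1[s] w - - 0 7;variant chennis;sm d1d5-;bm +S@d2;pv d1d5-,+S@d2,e1d1,c6c2-",
]


def parse_record(line):
    """Split one 'FEN;key value;key value;...' record into a dict (hypothetical helper)."""
    fen, *ops = line.strip().split(";")
    record = {"fen": fen}
    for op in ops:
        key, _, value = op.partition(" ")
        record[key] = value
    return record


def solution_line(record):
    """Return the pv as a tuple, without the leading supplied move if there is one."""
    moves = record["pv"].split(",")
    if "sm" in record and moves and moves[0] == record["sm"]:
        moves = moves[1:]
    return tuple(moves)


def group_by(records, key):
    """Group records by an arbitrary key function."""
    groups = {}
    for record in records:
        groups.setdefault(key(record), []).append(record)
    return groups


puzzles = [parse_record(line) for line in RECORDS]
by_start = group_by(puzzles, lambda r: r["fen"])   # criterion 1
by_solution = group_by(puzzles, solution_line)     # criterion 2
print(len(by_start), len(by_solution))  # prints: 2 1
```

The two records count as different puzzles under criterion 1 but collapse into one group under criterion 2, which is exactly the situation described in the report.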

Identical starting positions should not occur with the default generator settings, since the generator already filters duplicate FENs. When the supplied move is included, the duplicate filter could potentially be improved by filtering on the resulting FEN instead of the (FEN, move) pair.

if (fen, bestmove) not in fens:
    fens.add((fen, bestmove))

It might be a small hit on performance because you need to compute another FEN, but code-wise it should be easy to change.
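A minimal sketch of what filtering on the resulting FEN could look like. This is not the generator's actual code: `filter_duplicates` and `position_key` are names invented here, and `apply_move` is a hypothetical callable that returns the FEN after playing a move, which would have to be provided by whatever move-generation library is in use. Dropping the two move counters before comparing is also an assumption, and it presumes a six-field FEN.

```python
def position_key(fen):
    """Drop the halfmove/fullmove counters so they do not affect equality.

    Assumption: a six-field FEN. Anything else is compared unchanged.
    """
    fields = fen.split()
    return " ".join(fields[:-2]) if len(fields) == 6 else fen


def filter_duplicates(candidates, apply_move):
    """Yield (fen, move) pairs whose resulting position has not been seen yet.

    candidates: iterable of (fen, supplied_move) pairs; the move may be None.
    apply_move: hypothetical callable (fen, move) -> FEN after the move.
    """
    seen = set()
    for fen, move in candidates:
        # Without a supplied move the starting position is the puzzle position.
        resulting = apply_move(fen, move) if move else fen
        key = position_key(resulting)
        if key not in seen:
            seen.add(key)
            yield fen, move
```

With this key, the two chennis records from the report would be treated as one puzzle, since d2d5- and d1d5- lead to the same position, at the cost of one extra FEN computation per candidate.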

Identical solution lines from different starting positions are rather difficult to filter, since it is very hard to tell if the pattern is really the same. Also it might not even be a bad idea to have the same pattern in different contexts to generalize the pattern recognition. The only place where I used a very specific filtering of this kind so far was for Manchu, since probably 90% of your checkmate puzzles will just be the banner landing on c7/g7, which is super dull. For other variants I haven't seen anything like this though.

With regard to identical final positions, the main pattern occurring in practice is that a forced mate in n happened in the game. The generator will then report the mate in n, mate in n-1, ..., and mate in 1 all as separate puzzles. This looks very repetitive on a small scale when you look at the ordered list of resulting puzzles, and that is what some people have heavily criticized, but once puzzles are unordered/randomized I think their similarity is hardly a problem. Actually, having the same puzzle on different levels of difficulty can be very useful IMO. So I don't see a strong need for such filtering.
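If one nevertheless wanted to tag such chains (for example to spread them out rather than to remove them), a rough heuristic is to look for solution lines that are the tail of a longer solution line, skipping two plies at a time so that the same side is to move. The sketch below is only that: `find_tail_puzzles` is a made-up name, and matching move strings alone does not prove that the positions are identical, so puzzles from unrelated games could be flagged by coincidence.

```python
def find_tail_puzzles(solution_lines):
    """Return indices of lines that equal a longer line minus its first 2, 4, ... plies.

    solution_lines: list of move sequences (solver's move first).
    Heuristic only: compares move strings, not positions.
    """
    tails = set()
    for line in solution_lines:
        # Start at ply 2 and step by 2 so the tail begins with the solver's move again.
        for start in range(2, len(line), 2):
            tails.add(tuple(line[start:]))
    return [i for i, line in enumerate(solution_lines) if tuple(line) in tails]
```

For a mate in 3 with line (m1, r1, m2, r2, m3), this would flag the separately reported mate in 2 (m2, r2, m3) and mate in 1 (m3).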

So all in all, the only minor improvement I currently see is to fix the duplicate filtering for the scenario where the supplied move notation is used. Other than that, apart from very specific problems like the Manchu one, I do not consider duplicates a big problem so far.

To tell the truth, I posted the above example just as a curiosity. I completely agree with you. There is no real need to fix the supplied move case. I can delete one of them if it occurs again.