cameron-martin / tsumego

An app for playing tsumego puzzles

Home Page: https://tsumego.app/


Change puzzle win probability

cameron-martin opened this issue · comments

#15 picks puzzles that the user has a 50% chance of solving. We need a way of setting this probability to something different.

The following properties must hold:

  • The probability of the user solving puzzles given to them should be a fixed value. This arises from the notion that "puzzles should be of the correct difficulty" by just defining difficulty as the probability of the user solving a puzzle.
  • Every puzzle should have a non-zero probability of being shown to a user. This ensures that all puzzles will eventually get some games, and therefore have an opportunity to be rated.
  • Users should not be shown the same puzzle "too often". This is a bit of a vague requirement at the moment.
  • The probability of a puzzle being chosen only depends on the rating and rating deviation.

Let $s_i$ be the probability that puzzle $i$ is shown to the user. This is the distribution that we must sample from when showing the user a puzzle, but currently I do not know how to compute this distribution.

Let $w$ be the probability that the user wins. We want this to be a fixed value.

$w_i$ is the probability that the user wins given that the puzzle shown is puzzle $i$. This is a known value and can be computed as shown in the Glicko paper.
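For reference, the Glicko paper's expected-score formula gives this per-puzzle win probability from the user's rating and the puzzle's rating and rating deviation. A minimal sketch (function and variable names are mine):

```python
import math

Q = math.log(10) / 400  # Glicko scaling constant q

def g(rd: float) -> float:
    """Glicko g-function: discounts the rating difference by the
    opponent's (here, the puzzle's) rating deviation."""
    return 1 / math.sqrt(1 + 3 * (Q ** 2) * (rd ** 2) / math.pi ** 2)

def win_probability(user_rating: float, puzzle_rating: float, puzzle_rd: float) -> float:
    """Expected score from the Glicko paper: the probability that the
    user solves the puzzle."""
    return 1 / (1 + 10 ** (-g(puzzle_rd) * (user_rating - puzzle_rating) / 400))
```

For equal ratings this gives 0.5 regardless of the puzzle's RD, and a higher user rating pushes it above 0.5.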

The relationship between these values is, by the law of total probability:

$$w = \sum_i s_i w_i$$
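Assuming the relationship is the law of total probability over the selection distribution, it is a one-liner to evaluate:

```python
def overall_win_probability(s, w):
    """Total probability the user solves the next puzzle, given the
    selection distribution s (sums to 1) and the per-puzzle solve
    probabilities w: the dot product of the two."""
    assert abs(sum(s) - 1) < 1e-9, "s must be a probability distribution"
    return sum(si * wi for si, wi in zip(s, w))
```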

It should be made explicit that the probability that a puzzle is shown to the user depends on the rating and rating deviation of both the puzzle and the user, i.e. that the function is $s_i(r, \mathit{RD}, r_i, \mathit{RD}_i)$.

I think the best way forward is to assert a form for $s_i$ and optimise its parameters such that $w$ is relatively constant (within an interval?) over representative values of $r$ and $\mathit{RD}$ for all users. We may be able to source this from chess data - it seems reasonable to assume that the distribution of user/puzzle ratings and rating deviations is similar.
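As an illustration of this fitting step (the parametric form below is my own example, not something settled in this issue): take $s_i$ proportional to a Gaussian in $w_i$ around the target, with width $\sigma$ as the single parameter, and grid-search $\sigma$ so that the overall solve probability stays near the target over a stand-in pool of puzzles:

```python
import math
import random

def selection_weights(w_is, target, sigma):
    """One candidate form for s_i: a Gaussian in (w_i - target), normalised.
    sigma is the single free parameter to optimise."""
    raw = [math.exp(-((wi - target) ** 2) / (2 * sigma ** 2)) for wi in w_is]
    total = sum(raw)
    return [r / total for r in raw]

def overall_w(w_is, target, sigma):
    """w = sum_i s_i * w_i under the candidate form."""
    s = selection_weights(w_is, target, sigma)
    return sum(si * wi for si, wi in zip(s, w_is))

# Grid-search sigma so the overall solve probability stays near the target
# across a stand-in pool of per-puzzle solve probabilities. Real values
# would come from the user/puzzle rating distributions discussed above.
random.seed(0)
pool = [random.random() for _ in range(200)]
target = 0.6
best_err, best_sigma = min(
    (abs(overall_w(pool, target, s) - target), s)
    for s in [0.02 * k for k in range(1, 50)]
)
```

Note that this form also satisfies the second property above: every puzzle keeps a non-zero selection probability, although puzzles far from the target become extremely unlikely for small $\sigma$.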

The obvious choices of form are polynomials of increasing degree or, in the extreme, an artificial neural network.

Edit: I hadn't appreciated that $s_i$ depends on all puzzles ...

Alternatively, given a value of $w_i$ for a user and all puzzles (which we could recompute each rating period?), we could select puzzles such that the distribution of $w_i$ is approximately a chosen bounded-domain distribution on $[0, 1]$.
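One simple way to realise this (a sketch under my own assumptions, in particular the binning scheme and the `target_cdf` parameter): bin the puzzles by $w_i$, draw a bin with the target distribution's probability mass, then pick uniformly within the bin.

```python
import random

def sample_puzzle(w_is, target_cdf, n_bins=10, rng=random):
    """Sample a puzzle index so that the solve probability w_i of the
    chosen puzzle is approximately distributed as a chosen bounded
    distribution on [0, 1].

    target_cdf: CDF of the target distribution (e.g. a Beta CDF).
    Bins [0, 1] into n_bins, draws a bin with the target's mass in that
    bin (empty bins get zero weight), then picks a puzzle uniformly
    from the drawn bin."""
    bins = [[] for _ in range(n_bins)]
    for i, wi in enumerate(w_is):
        bins[min(int(wi * n_bins), n_bins - 1)].append(i)
    mass = [
        target_cdf((b + 1) / n_bins) - target_cdf(b / n_bins) if bins[b] else 0.0
        for b in range(n_bins)
    ]
    b = rng.choices(range(n_bins), weights=mass)[0]
    return rng.choice(bins[b])
```

With `target_cdf = lambda x: x` this reduces to a roughly uniform spread of difficulties; a Beta CDF peaked at the desired win probability would concentrate selection there while still giving every puzzle a non-zero chance.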