Probability distribution of patterns and entropy calculation

Question

jameelhassan opened this issue 2 years ago · comments

As per the source code below, the pattern distribution is calculated by comparing allowed_words against possible_words. Shouldn't we use allowed_words for both instead, since any of the allowed words could be generating a given pattern?
i.e., shouldn't the function call be get_pattern_matrix(allowed_words, allowed_words)?

def get_pattern_distributions(allowed_words, possible_words, weights):
    """
    For each possible guess in allowed_words, this finds the probability
    distribution across all of the 3^5 wordle patterns you could see, assuming
    the possible answers are in possible_words with associated probabilities
    in weights.
    It considers the pattern hash grid between the two lists of words, and uses
    that to bucket together words from possible_words which would produce
    the same pattern, adding together their corresponding probabilities.
    """
    pattern_matrix = get_pattern_matrix(allowed_words, possible_words)

Jameel Hassan · Answer 1 · Thu Jul 21 2022 01:56:50 GMT+0800 (China Standard Time)

Got this sorted. The codebase is correct as it is.

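Let's say I choose the word CRANE as my guess.
The pattern I see is determined by the hidden ANSWER, which can only come from possible_words; allowed_words only limits what I am allowed to guess. To get all greys, the ANSWER must not contain any of the letters in "crane". So the probability of getting all greys is the number of ANSWER WORDS without any of the letters in "crane" divided by the total number of words in the answer list (or, if the answers are weighted, the sum of the weights of those words).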
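
To make that concrete, here is a minimal sketch of the same idea in plain Python. It is not the repository's implementation (which works on a precomputed pattern matrix); the helper names wordle_pattern, pattern_distribution and entropy_bits, and the four-word answer list, are made up for illustration. It buckets the possible answers by the pattern they would produce for one guess, sums their weights, and computes the entropy of the resulting distribution.

```python
import math
from collections import Counter


def wordle_pattern(guess, answer):
    """Hypothetical helper: 0 = grey, 1 = yellow, 2 = green for each guess letter."""
    pattern = [0] * len(guess)
    unmatched = Counter()
    # First pass: mark greens and count answer letters that were not matched.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = 2
        else:
            unmatched[a] += 1
    # Second pass: mark yellows while unmatched copies of the letter remain.
    for i, g in enumerate(guess):
        if pattern[i] == 0 and unmatched[g] > 0:
            pattern[i] = 1
            unmatched[g] -= 1
    return tuple(pattern)


def pattern_distribution(guess, possible_answers, weights=None):
    """Sum the weight of every possible answer that yields the same pattern."""
    if weights is None:
        # Assumption: every answer is equally likely.
        weights = [1.0 / len(possible_answers)] * len(possible_answers)
    dist = {}
    for answer, weight in zip(possible_answers, weights):
        pattern = wordle_pattern(guess, answer)
        dist[pattern] = dist.get(pattern, 0.0) + weight
    return dist


def entropy_bits(dist):
    """Shannon entropy (in bits) of a pattern distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# Toy answer list, purely illustrative.
possible_answers = ["crane", "slate", "moist", "pupil"]
dist = pattern_distribution("crane", possible_answers)
all_grey = (0, 0, 0, 0, 0)
print(dist[all_grey])      # 0.5: "moist" and "pupil" share no letters with "crane"
print(entropy_bits(dist))  # 1.5
```

Note that the guess can be any allowed word, but the loop that builds the distribution only ever runs over the possible answers, which is why the second argument to get_pattern_matrix is possible_words.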