Random word generator library is tricky
k8hertweck opened this issue · comments
That's a shame, because there are only two random word generation libraries that I could find:
random_word (which is the unstable one)
RandomWordGenerator
The latter will be more stable because it isn't actually sourcing real words from anywhere, it just mashes together a random selection of letters to form "words".
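To make "mashes together a random selection of letters" concrete, here is a minimal standard-library sketch of the idea. It is not RandomWordGenerator's actual implementation; the function names, length limits, and seed parameter are hypothetical and chosen only for illustration.

```python
import random
import string


def make_pseudo_word(length, rng):
    """Return `length` random lowercase letters joined together (not a real word)."""
    return "".join(rng.choices(string.ascii_lowercase, k=length))


def make_pseudo_words(num_words, min_len=3, max_len=10, seed=None):
    """Return `num_words` distinct pseudo-words (hypothetical helper).

    A set is used so that no pseudo-word appears twice in the result.
    """
    rng = random.Random(seed)  # a seed makes the output reproducible
    words = set()
    while len(words) < num_words:
        words.add(make_pseudo_word(rng.randint(min_len, max_len), rng))
    return sorted(words)
```

Because nothing is fetched from an external word source, a generator like this has no network dependency, which is presumably why this kind of approach is the more stable one.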
Despite the downside of not having a list of real words, we should probably use RandomWordGenerator given the issues with the other package. The relevant part of the testing chapter would read as follows:
Fortunately, a Python library called RandomWordGenerator exists to do just that. We can install it using pip, the Python Package Installer:
$ pip install Random-Word-Generator
Borrowing from the word count distribution we created for test_alpha, we can then create a text file full of random words with a frequency distribution that corresponds to an α of approximately 1.0:
import numpy as np
from RandomWordGenerator import RandomWord
max_freq = 600
word_counts = np.floor(max_freq / np.arange(1, max_freq + 1))
rw = RandomWord()
random_words = rw.getList(num_of_words=max_freq)
writer = open('test_data/random_words.txt', 'w')
for index in range(max_freq):
    count = int(word_counts[index])
    word_sequence = f"{random_words[index]} " * count
    writer.write(word_sequence + '\n')
writer.close()
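As a sanity check on a file written this way, something like the following sketch could read the words back and compare their counts against the intended distribution. The helper names are hypothetical, and the comparison assumes every generated word is distinct; if the generator happens to return the same word twice, the observed counts would merge and the check would fail.

```python
from collections import Counter


def expected_counts(max_freq):
    """Counts floor(max_freq / rank) for ranks 1..max_freq, largest first."""
    return [max_freq // rank for rank in range(1, max_freq + 1)]


def observed_counts(lines):
    """Count each whitespace-separated word and return the counts, largest first."""
    counter = Counter()
    for line in lines:
        counter.update(line.split())
    return sorted(counter.values(), reverse=True)
```

For the file above, this could be used as:

```python
with open('test_data/random_words.txt') as reader:
    assert observed_counts(reader) == expected_counts(600)
```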
Included in #560
resolved by d78fd0e
I have released a new version on PyPI (https://pypi.org/project/Random-Word/1.0.6/), which should fix the recent issues. Next weekend, I will refactor the whole repository to support multiple sources such as Oxford.
PS: I didn't know that this small project would blow up. I originally made it because I wanted to use it in one of my projects at university.
Cc @k8hertweck / @DamienIrving