Create a Bloom filter from the lexemes in a Wikidata data dump (latest-lexemes.json.gz), then use it to check whether a dataset from the Swedish parliament contains words that are not yet among the Wikidata lexemes.
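
The workflow described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the `BloomFilter` class and the helpers `lemmas_from_dump_line`, `build_filter`, and `unknown_words` are hypothetical names, only lemmas (not inflected forms) are indexed, and the dump is assumed to hold one JSON entity per line with a `lemmas` object keyed by language code.

```python
import gzip
import hashlib
import json
import math


class BloomFilter:
    """A small Bloom filter backed by a bytearray (illustrative only)."""

    def __init__(self, capacity, error_rate=0.001):
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.num_bits = max(8, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, word):
        # Double hashing: derive all bit positions from two halves of one digest.
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, word):
        for pos in self._positions(word):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, word):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(word))


def lemmas_from_dump_line(line, language="sv"):
    """Yield lowercased lemmas from one dump line (layout is an assumption)."""
    line = line.strip().rstrip(",")
    if line in ("", "[", "]"):
        return  # skip the array brackets that wrap the dump
    entity = json.loads(line)
    for code, lemma in entity.get("lemmas", {}).items():
        if code == language:
            yield lemma["value"].lower()


def build_filter(dump_path, capacity):
    """Stream the gzipped dump and add every matching lemma to a new filter."""
    bloom = BloomFilter(capacity)
    with gzip.open(dump_path, "rt", encoding="utf-8") as handle:
        for line in handle:
            for lemma in lemmas_from_dump_line(line):
                bloom.add(lemma)
    return bloom


def unknown_words(bloom, words):
    """Return the sorted, lowercased words that the filter has never seen."""
    return sorted({w.lower() for w in words if w.lower() not in bloom})
```

A call such as `unknown_words(build_filter("latest-lexemes.json.gz", capacity=1_000_000), tokens)` would then list candidate words from the parliament dataset, where `tokens` and the capacity are placeholders to adapt. A Bloom filter never gives false negatives, so every word reported as unknown is definitely absent from the filter; a small fraction of missing words may go unreported because of false positives.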