zeionara / neeko

neeko

User and project name generator, which makes nicknames following given rules.

Install dependencies

To set up the environment, use the provided setup.sh script, which downloads and installs the required version of the Ballerina compiler:

./setup.sh

Pack the neeko package and push it to the local repository:

cd neeko
bal pack && bal push --repository local
cd -

Run the project

To run the project, first generate an inverted index. The following command (run from the root of the cloned repo) builds the index and saves it locally as assets/index.bin:

bal run index -- -CmaxNgramLength=3
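The idea of an inverted index over character n-grams can be illustrated with a short Python sketch. This is not the project's Ballerina implementation; the function names `char_ngrams` and `build_index` are hypothetical, and the choice to index every substring of length 1 up to the maximum is an assumption made for illustration.

```python
from collections import defaultdict


def char_ngrams(word, max_len):
    """Return the set of distinct substrings of `word` with length 1..max_len."""
    grams = set()
    for n in range(1, max_len + 1):
        for i in range(len(word) - n + 1):
            grams.add(word[i:i + n])
    return grams


def build_index(words, max_ngram_length=3):
    """Map each character n-gram to the set of words that contain it."""
    index = defaultdict(set)
    for word in words:
        for gram in char_ngrams(word, max_ngram_length):
            index[gram].add(word)
    return index


# Tiny stand-in for the English dictionary used by the real tool.
index = build_index(["mahri", "uriah", "neeko"], max_ngram_length=3)
print(sorted(index["ah"]))  # ['mahri', 'uriah']
```

With a maximum length of 3, an n-gram such as `neek` is simply absent from the index, which is why longer queries need an index built with a larger `maxNgramLength`.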

For longer n-grams, it is recommended to split the generated index into multiple segments, which are loaded into memory separately during the search phase. In this case you should also provide the name of a folder in which the index segments will be saved as binary files. If that folder already exists, it will be overwritten; if a file with the same name exists, the program will crash with an error:

bal run index -- -CmaxNgramLength=5 -CnSegments=2 -CindexPath=assets/index
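A rough Python sketch of why segmentation helps: if each n-gram is assigned to exactly one segment, a lookup only has to consult that one segment. How neeko actually assigns n-grams to segments is not documented here, so the hash-based partitioning below, along with the names `segment_of`, `split_index` and `lookup`, is purely an assumption for illustration.

```python
import zlib


def segment_of(gram, n_segments):
    """Pick a segment for `gram`; crc32 is stable across runs, unlike hash() on str."""
    return zlib.crc32(gram.encode("utf-8")) % n_segments


def split_index(index, n_segments):
    """Partition an n-gram -> words mapping into `n_segments` smaller mappings."""
    segments = [dict() for _ in range(n_segments)]
    for gram, words in index.items():
        segments[segment_of(gram, n_segments)][gram] = words
    return segments


def lookup(segments, gram):
    """Consult only the single segment that could contain `gram`."""
    return segments[segment_of(gram, len(segments))].get(gram, set())


index = {"ah": {"mahri", "uriah"}, "ri": {"mahri", "uriah"}, "eko": {"neeko"}}
segments = split_index(index, n_segments=2)
print(sorted(lookup(segments, "eko")))  # ['neeko']
```

In a real tool each segment would be a separate file on disk, so only one of them needs to be held in memory at a time.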

The index is generated from a dictionary of English words.

Then you can run the search command to find the required words:

bal run search -- -Cngrams='ne eko  ah ri' -CtopN=5

Alternatively, if your index consists of multiple files, you should provide path to the respective folder:

bal run search -- -Cngrams='ne eko  ah ri' -CtopN=5 -CindexPath=assets/index

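The command finds all words that match at least one group of character n-grams. N-grams within a group are separated by a single space, and a word matches a group only if it contains all of that group's n-grams; groups are separated by two spaces. The output looks like this: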

Matched words:

mahri
uriah
mahori
meriah
pahari
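The query semantics described above (all n-grams within a group, any one of the groups) can be sketched in Python as follows. The sketch scans a word list directly instead of using an index, the names `parse_query` and `search` are hypothetical, and the ordering (shorter words first, then alphabetical) is a guess that happens to agree with the sample output; the actual ranking used by neeko is not documented here.

```python
def parse_query(ngrams):
    """Split on two spaces into groups, then on whitespace into n-grams."""
    return [group.split() for group in ngrams.split("  ") if group.strip()]


def search(words, ngrams, top_n=5):
    """Return up to `top_n` words containing every n-gram of at least one group."""
    groups = parse_query(ngrams)
    matched = [
        word for word in words
        if any(all(gram in word for gram in group) for group in groups)
    ]
    # Assumed ranking: shortest first, ties broken alphabetically.
    matched.sort(key=lambda word: (len(word), word))
    return matched[:top_n]


words = ["pahari", "neeko", "banana", "uriah", "mahri"]
print(search(words, "ne eko  ah ri", top_n=5))
# ['mahri', 'neeko', 'uriah', 'pahari']
```

Here `neeko` matches the first group (`ne` and `eko`), while the other three results match the second group (`ah` and `ri`); `banana` matches neither and is dropped.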

Precomputed indices

The project comes with two precomputed indices stored in GitHub LFS:

  1. assets/index.bin - monolithic index which supports ngrams with length 3 or less;
  2. assets/index - 3-segment index which supports ngrams with length 5 or less.

Test

To run tests use the following command:

bal test index

License: Apache License 2.0


Languages

Ballerina 95.5%, Shell 4.5%