hardikp / fnlp

[OUTDATED] A set of classes/scripts for NLP tasks focused on finance data

fnlp

This repo contains scripts to train NLP models on text data, with a focus on finance.

Dependencies

Train new GloVe vectors

glove.py contains a GloVe model written in PyTorch. dataset.py contains a Dataset class, written so that PyTorch's torch.utils.data.DataLoader utility class can be used for training.

$ python3 glove.py --input wiki_data.txt --batch_size 512
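For readers new to GloVe, here is a minimal plain-Python sketch of the data that such a training run consumes. It is not the code in glove.py or dataset.py. The names build_cooccurrence, glove_weight and CooccurrenceDataset, the window size, and the toy sentence are all assumptions made for illustration. The defaults x_max=100 and alpha=0.75 follow the GloVe paper. The class only implements __len__ and __getitem__, which is the map-style protocol that torch.utils.data.DataLoader indexes into.

```python
from collections import defaultdict


def build_cooccurrence(tokens, window=2):
    """Count symmetric co-occurrences, weighting each pair by 1/distance."""
    counts = defaultdict(float)
    for i, word in enumerate(tokens):
        # Look back at most `window` tokens and credit both directions.
        for j in range(max(0, i - window), i):
            distance = i - j
            counts[(word, tokens[j])] += 1.0 / distance
            counts[(tokens[j], word)] += 1.0 / distance
    return dict(counts)


def glove_weight(x, x_max=100.0, alpha=0.75):
    """GloVe weighting f(x): down-weights rare pairs, caps frequent ones at 1."""
    return (x / x_max) ** alpha if x < x_max else 1.0


class CooccurrenceDataset:
    """Hypothetical map-style dataset of (word_id, context_id, count) triples."""

    def __init__(self, tokens, window=2):
        vocab = sorted(set(tokens))
        self.word_to_id = {w: k for k, w in enumerate(vocab)}
        counts = build_cooccurrence(tokens, window)
        self.samples = sorted(
            (self.word_to_id[a], self.word_to_id[b], x)
            for (a, b), x in counts.items()
        )

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        return self.samples[index]


# Toy sentence, made up for illustration.
tokens = "the stock price rose as the stock market rallied".split()
dataset = CooccurrenceDataset(tokens, window=2)
print(len(dataset), dataset[0])
```

Each triple supplies one term of the GloVe objective: the model's prediction for the pair is compared against the log of the count, and the squared error is scaled by glove_weight(count).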

Check the word vectors

Trained word vectors are available on the releases page.

Let's check if the closest words make sense.

$ python3 test_word_vectors.py --word IRA
roth, iras, sep, 401, contribute

$ python3 test_word_vectors.py --word option
call, options, put, exercise, underlying

$ python3 test_word_vectors.py --word stock
shares, share, market, stocks, price
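A lookup like the ones above can be sketched as a nearest-neighbour search over the vectors. The sketch below assumes cosine similarity as the closeness measure, which may differ from what test_word_vectors.py does. The function names and the 2-d toy vectors are invented for illustration and are not taken from the released vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def closest_words(word, vectors, top_k=5):
    """Return the top_k words whose vectors are most similar to `word`."""
    query = vectors[word]
    scored = [
        (cosine_similarity(query, vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:top_k]]


# Toy 2-d vectors, made up for illustration only.
toy_vectors = {
    "stock": [1.0, 0.1],
    "shares": [0.9, 0.2],
    "market": [0.7, 0.5],
    "option": [0.1, 1.0],
    "call": [0.2, 0.9],
}
print(closest_words("stock", toy_vectors, top_k=2))
```

With real vectors of a few hundred dimensions the same logic applies, although a vectorized implementation would be much faster than these Python loops.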

Notes

This CPU-only implementation is not yet optimized. For training on CPU, it might be best to download the GloVe software from here.

Credits

License

MIT

Languages

Language:Python 100.0%