DataSenseiAryan / TS3000_TheChatBOT

It's a social networking chatbot trained on a Reddit dataset. It supports open-ended queries and is built on the concept of Neural Machine Translation. Beware: it can be sarcastic, just like its creator 😝 BTW, it uses the PyTorch framework and Python 3.


TS3000_TheChatBot

Tony Stark 3000 - The Chat Bot. It's a very basic conversational AI.

It's a social networking chatbot trained on a Reddit dataset. It supports open-ended queries and is built on the concept of Neural Machine Translation. Beware: it can be sarcastic, like its creator 😝 BTW, it uses the PyTorch framework and Python 3.

Data Preprocessing :

Download Reddit Data From Here

Follow these steps :

  • Put the downloaded data in the Data_Preprocessing directory.

  • Unzip the .bz2 file :

    bzip2 -dk filename.bz2

  • Run createDB.py

    python3 createDB.py

    This creates the database from the raw JSON text file you unzipped earlier.

  • Run createCORPUS.py

    python3 createCORPUS.py

    This creates the corpus. For example, I created 2011-08small.txt.

  • Move the created corpus to the Data directory.
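
To give a rough idea of what the preprocessing steps above do, here is a minimal sketch. It is not the actual code of createDB.py or createCORPUS.py. It assumes the dump is a bz2-compressed file with one JSON comment per line, and that each comment carries the `id`, `parent_id` and `body` fields found in public Reddit comment dumps (a reply to a comment has `parent_id` equal to `t1_` plus the parent's `id`). The table schema and all function names are hypothetical.

```python
import bz2
import json
import sqlite3


def iter_comments(path):
    """Yield one comment dict per non-empty line of a bz2-compressed JSON-lines file."""
    with bz2.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def build_db(comments, db_path=":memory:"):
    """Store (id, parent_id, body) rows in SQLite. The schema is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS comment ("
        "id TEXT PRIMARY KEY, parent_id TEXT, body TEXT)"
    )
    rows = ((c["id"], c["parent_id"], c["body"]) for c in comments)
    conn.executemany("INSERT OR IGNORE INTO comment VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def iter_pairs(conn):
    """Yield (parent_body, reply_body) for replies whose parent comment is stored."""
    query = (
        "SELECT p.body, r.body FROM comment AS r "
        "JOIN comment AS p ON r.parent_id = 't1_' || p.id "
        "ORDER BY r.rowid"
    )
    for parent_body, reply_body in conn.execute(query):
        yield parent_body, reply_body
```

With this sketch, `build_db(iter_comments("filename.bz2"))` followed by `iter_pairs(...)` would give the comment/reply pairs that a corpus file could be written from. Reading the .bz2 file directly is a choice made for the sketch; the steps above unzip it first.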


Training Model :

  • Start training the model using this command :

    python3 main.py -tr data/2013-09small.txt -l -lr 0.0001 -it 50000 -b 64 -p 500 -s 1000

  • To resume training from where you left off :

    python3 main.py -tr data/2013-09small.txt -l save/model/2013-09small/1-1_512/3000_backup_bidir_model.tar -lr 0.0001 -it 50000 -b 64 -p 500 -s 1000
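
When resuming, you need the path of the most recent saved model. The helper below is a small sketch for finding it. It is not part of this repo, and it assumes that checkpoint files follow the `<number>_backup_bidir_model.tar` pattern seen in the command above, with the leading number being the training iteration. The function name `latest_checkpoint` is hypothetical.

```python
import re
from pathlib import Path

# Assumed pattern, inferred from "3000_backup_bidir_model.tar" in the command above.
CHECKPOINT_RE = re.compile(r"^(\d+)_backup_bidir_model\.tar$")


def latest_checkpoint(model_dir):
    """Return the checkpoint path with the highest leading number, or None."""
    best_path, best_iter = None, -1
    for path in Path(model_dir).iterdir():
        match = CHECKPOINT_RE.match(path.name)
        if match and int(match.group(1)) > best_iter:
            best_path, best_iter = path, int(match.group(1))
    return best_path
```

For example, `latest_checkpoint("save/model/2013-09small/1-1_512")` would return the file to pass to `-l`, provided the directory exists. The numbers are compared as integers, so 10000 sorts after 3000.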


Testing Model :

  • To test the model in interactive mode :

    python3 main.py -te save/model/2013-09small/1-1_512/3000_backup_bidir_model.tar -c data/2013-09small.txt -i
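
The flags used in the commands above are not explained in this README. The sketch below shows one way a parser could accept them with `argparse`. It is a guess from how the flags are used here, not the actual parser in main.py: the `dest` names and help texts are hypothetical, and reading `-it`, `-b`, `-p` and `-s` as iterations, batch size, print interval and save interval is an assumption.

```python
import argparse


def build_parser():
    """Hypothetical parser that accepts the flags shown in this README."""
    p = argparse.ArgumentParser(prog="main.py")
    p.add_argument("-tr", dest="train", help="corpus file to train on")
    p.add_argument("-te", dest="test", help="saved model file to test")
    # nargs="?" lets -l appear with or without a path, as in the two training commands.
    p.add_argument("-l", dest="load", nargs="?", help="saved model file to resume from")
    p.add_argument("-c", dest="corpus", help="corpus file used with the saved model")
    p.add_argument("-i", dest="interactive", action="store_true", help="interactive mode")
    p.add_argument("-lr", dest="learning_rate", type=float)
    p.add_argument("-it", dest="iterations", type=int)    # assumed meaning
    p.add_argument("-b", dest="batch_size", type=int)     # assumed meaning
    p.add_argument("-p", dest="print_every", type=int)    # assumed meaning
    p.add_argument("-s", dest="save_every", type=int)     # assumed meaning
    return p


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

Saving this as a scratch file and running it with the same arguments as the commands above prints how each flag would be read under these assumptions.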


Acknowledgements :


License :

MIT License

Copyright (c) 2019 Aryan Chaudhary



Languages

Python 77.2%, Jupyter Notebook 22.8%