Overview:
This is an implementation of a seq2seq network for building a chatbot. The model is trained on the Microsoft Research Social Media Conversation Corpus, which consists of series of tweet IDs, each series forming a dialog between two people. The data needs to be manipulated before it is fit for feeding to the model. After a few hours of training, the chatbot can hold an interesting conversation.
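The kind of data manipulation mentioned above can be sketched roughly as follows. This is a minimal illustration, not the preprocessing used in this repository: the special tokens, the function names (`tokenize`, `build_vocab`, `encode`), the fixed length of 8, and the sample dialog pairs are all assumptions made for the example, and the regex tokenizer is a simple stand-in for nltk.

```python
import re
from collections import Counter

# Hypothetical special tokens; the actual project may use different ones.
PAD, UNK = "<pad>", "<unk>"


def tokenize(text):
    # Lowercase and split into words and basic punctuation.
    # A simple stand-in for an nltk tokenizer.
    return re.findall(r"[a-z0-9']+|[.,!?]", text.lower())


def build_vocab(pairs, max_size=10000):
    # Count tokens over both sides of every (question, answer) pair.
    counts = Counter(
        tok for q, a in pairs for tok in tokenize(q) + tokenize(a)
    )
    words = [w for w, _ in counts.most_common(max_size)]
    # Reserve index 0 for padding and index 1 for unknown words.
    return {w: i for i, w in enumerate([PAD, UNK] + words)}


def encode(text, vocab, max_len):
    # Map tokens to ids, truncate to max_len, then pad on the right.
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokenize(text)][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Made-up dialog pairs, standing in for tweets resolved from their IDs.
pairs = [("How are you?", "I'm fine, thanks!"), ("What's up?", "Not much.")]
vocab = build_vocab(pairs)
encoded = [(encode(q, vocab, 8), encode(a, vocab, 8)) for q, a in pairs]
```

Each dialog turn becomes a fixed-length list of integer ids, which is the general shape of input a seq2seq model expects. Padding to a single length keeps the sketch short; bucketing by length is a common alternative.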
Dependencies:
- numpy
- six
- nltk (for data preprocessing)
- tensorflow (version 1.1.0 throws an error that is expected to be fixed in the next release; use version 1.0.0 instead)
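A small guard like the one below could catch an unsupported tensorflow version before training starts. It is only a sketch: the helper name `is_supported_tf` is hypothetical, and accepting exactly 1.0.0 is an assumption based on the version named above.

```python
import re


def is_supported_tf(version):
    # Hypothetical helper: parse "major.minor.patch" from the start of the
    # version string and accept only 1.0.0, the version named in this README.
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        return False
    return tuple(int(part) for part in match.groups()) == (1, 0, 0)


# Typical use, assuming tensorflow is installed:
#   import tensorflow as tf
#   if not is_supported_tf(tf.__version__):
#       raise RuntimeError("Please install tensorflow 1.0.0")
```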
Other datasets that can be used for training:
- Ubuntu dialog corpus v2.0 (https://github.com/rkadlec/ubuntu-ranking-dataset-creator)
- Your own chat data