UmerTariq1 / Twitter-Data-Format-Change

Twitter Data Format For Chatbot Training

This project converts the Twitter customer support data found at https://www.kaggle.com/thoughtvector/customer-support-on-twitter into the format required by the sample project in PyTorch's chatbot tutorial: https://pytorch.org/tutorials/beginner/chatbot_tutorial.html

PyTorch provides a chatbot model training project on its website as a sample project. It uses the Cornell Movie-Dialogs Corpus. Sometimes this chatbot does not work as expected because the data it is trained on is limited to movie dialogs. The Customer Support on Twitter dataset, on the other hand, is a large, modern corpus of tweets and replies, intended to aid innovation in natural language understanding and conversational models and to support the study of modern customer support practices and their impact. It consists of tweets together with their replies and the tweets they respond to. Its format is nothing like the data format required to train the PyTorch chatbot model.

So, the goal of this project was to provide Kaggle's Twitter customer support data in a format that can be used by PyTorch's chatbot model.
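To make the conversion concrete, here is a minimal sketch of the idea; it is not the notebook's actual code. It assumes the Kaggle CSV has the columns `tweet_id`, `inbound`, `text` and `in_response_to_tweet_id`, and that the tutorial reads one tab-separated query/response pair per line. The helper names `clean`, `build_pairs` and `write_pairs`, and the sample rows, are hypothetical and used only for illustration.

```python
import csv
import io

# Hypothetical sample rows; the column names are an assumption about the
# Kaggle "Customer Support on Twitter" CSV.
SAMPLE = """tweet_id,author_id,inbound,created_at,text,response_tweet_id,in_response_to_tweet_id
1,cust_a,True,t0,My order   never arrived,2,
2,SupportCo,False,t1,Sorry to hear that. Please DM us.,,1
3,cust_b,True,t2,Thanks!,,2
"""


def clean(text):
    """Collapse tabs, newlines and repeated spaces into single spaces,
    so the tab character can safely serve as the pair delimiter."""
    return " ".join(text.split())


def build_pairs(rows):
    """Pair each customer tweet with the company tweet that replies to it."""
    rows = list(rows)
    by_id = {row["tweet_id"]: row for row in rows}
    pairs = []
    for reply in rows:
        parent = by_id.get(reply["in_response_to_tweet_id"])
        if parent is None:
            continue  # not a reply, or the parent tweet is not in the data
        # Keep only customer -> company exchanges (assumes "True"/"False" strings).
        if parent["inbound"] == "True" and reply["inbound"] == "False":
            pairs.append((clean(parent["text"]), clean(reply["text"])))
    return pairs


def write_pairs(pairs, out_file):
    """Write one tab-separated query/response pair per line."""
    for query, response in pairs:
        out_file.write(f"{query}\t{response}\n")


if __name__ == "__main__":
    demo_pairs = build_pairs(csv.DictReader(io.StringIO(SAMPLE)))
    demo_out = io.StringIO()
    write_pairs(demo_pairs, demo_out)
    print(demo_out.getvalue(), end="")
```

With the real dataset, the `io.StringIO(SAMPLE)` argument would be replaced by an open handle to the downloaded CSV, and the output would go to a text file that the tutorial's data-loading step can read in place of its formatted movie-lines file.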

Languages

Language:Jupyter Notebook 100.0%