sepehrgh98/Chatbot

Chatbot

A simple retrieval-based and generative chatbot for holding conversations. 🗣

Table of Contents

About The Project
- Built With
Parts
Contact

About The Project

In this project, two kinds of chatbot have been implemented.

Retrieval_based

This chatbot uses context to answer questions. Every question is answered based on contexts taken from Wikipedia. It can recognize and answer questions that are phrased with different structures and different words.

Generative_based

This chatbot interprets questions and generates the answers itself. A sequence-to-sequence (seq2seq) structure is used to extract features from questions and produce answers.

Built With

Major frameworks/libraries used in this project:

Parts

Data preparation and augmentation

Retrieval_based

In this project, biographies of 500 well-known people from around the world were used as the dataset. All of these biographies were collected from Wikipedia.

Generative_based

In this project, a dataset of conversations from the Amazon website is used. This dataset includes about 320,000 Q&A pairs collected from conversations between customers and operators. Words are represented by dictionary lookup: a collection of all the words used is built and each word is assigned an index before training starts. The same dictionary is used in the test stage to convert indices back to words.
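
The dictionary-lookup representation described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the `<pad>` and `<unk>` tokens, and the whitespace tokenization are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical special tokens; the original project may use different ones.
PAD, UNK = "<pad>", "<unk>"


def build_vocab(sentences, min_count=1):
    """Assign an integer index to every word seen at least min_count times."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    words = sorted(w for w, c in counts.items() if c >= min_count)
    word_to_index = {w: i for i, w in enumerate([PAD, UNK] + words)}
    index_to_word = {i: w for w, i in word_to_index.items()}
    return word_to_index, index_to_word


def encode(sentence, word_to_index):
    """Convert a sentence to indices; unseen words map to the UNK index."""
    unk = word_to_index[UNK]
    return [word_to_index.get(w, unk) for w in sentence.lower().split()]


def decode(indices, index_to_word):
    """Convert indices back to words, as needed in the test stage."""
    return " ".join(index_to_word[i] for i in indices)


# Toy Q&A pair standing in for the real dataset.
word_to_index, index_to_word = build_vocab(["does it fit", "yes it does"])
question_ids = encode("does it fit", word_to_index)
print(question_ids, "->", decode(question_ids, index_to_word))
```

Keeping both directions of the mapping is what lets the same dictionary serve training (words to indices) and testing (indices back to words).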

Pre-processing

Retrieval_based

All texts were split into sentences
Words that have little impact on the training process were removed
A dictionary was created from the remaining words (useful in both the training and test stages)
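
The three steps above can be sketched as follows. This is only an illustration under stated assumptions: the stop-word list, the regular expressions, and the function names are hypothetical, since the README does not say which tools or word lists the project uses.

```python
import re

# Tiny illustrative stop-word list (an assumption; the project's list is not given).
STOP_WORDS = {"a", "an", "the", "is", "was", "of", "in", "and", "to"}


def split_sentences(text):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def remove_stop_words(sentence):
    """Lowercase the sentence, tokenize it, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", sentence.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_dictionary(token_lists):
    """Give every remaining word an index, in alphabetical order."""
    words = sorted({t for tokens in token_lists for t in tokens})
    return {w: i for i, w in enumerate(words)}


text = "Ada Lovelace was a mathematician. She wrote the first program."
token_lists = [remove_stop_words(s) for s in split_sentences(text)]
dictionary = build_dictionary(token_lists)
print(token_lists)
print(dictionary)
```

The punctuation-based splitter is deliberately naive and will break on abbreviations such as "Dr."; a real pipeline would likely use a dedicated sentence tokenizer.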

Generative_based

All Q&A pairs were split into sentences
Words that have little impact on the training process were removed
A collection of the words used was created and an index was assigned to each word

Model

An RNN-based sequence-to-sequence model.

Each cell is an LSTM.

The model is designed to support an attention mechanism.

Process

We feed all Q&A pairs to the model in order to extract features from the input. The features are stored in a matrix called the 'context vector', which holds a summary of the input computed with the attention method. Finally, the output is generated word by word from the 'context vector' and the previously generated words.
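
To make the attention step more concrete, here is a minimal sketch of how a context vector can be computed for one decoding step. It assumes dot-product scoring, which the README does not specify, and it uses plain Python lists in place of the LSTM hidden states; the function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_context(decoder_state, encoder_states):
    """Return (weights, context) for one decoding step.

    Each encoder state is scored against the current decoder state
    (dot-product scoring is an assumption here), and the context is
    the weighted sum of the encoder states.
    """
    scores = [dot(decoder_state, h) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context


# Two toy encoder states standing in for two input words.
encoder_states = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attention_context([1.0, 0.0], encoder_states)
print(weights, context)
```

In a full model, the decoder would combine this context with the previously generated word to predict the next word, then repeat the computation with its updated state.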

Contact

Sepehr Ghamari - @sepehrgh98 - sepehrghamri@gmail.com

Project Link: https://github.com/sepehrgh98/Speech-Recognition
