adimyth / iplbot

A retrieval-based chatbot, BotVic. Chat with BotVic about the IPL and have fun.

Home Page: https://ipl-qa-bot.herokuapp.com/

IPL BOT

A retrieval-based question-answering bot trained on IPL Wikipedia pages. Built using Streamlit and deployed on Heroku: https://ipl-qa-bot.herokuapp.com/

Demo

Libraries Used

  • Streamlit - For creating the web app
  • scikit-learn - For fitting a TF-IDF vectorizer
  • BeautifulSoup, Requests - For fetching and parsing data

How to run the app

Clone the repository

git clone https://github.com/adimyth/iplbot.git

Install the requirements (from inside the cloned directory, where requirements.txt lives)

cd iplbot
pip install -r requirements.txt

Run the app

streamlit run app.py

How it works

  1. Run extractor.py to extract text from a list of IPL Wikipedia pages

  2. Given an input sentence, the generate_response function in bot.py does the following:

    • Lowercases the entire string
    • Removes punctuation marks
    • Tokenizes the string into words
    • Lemmatizes each token
    • Fits a TF-IDF vectorizer on the sentences generated in step 1 together with the input sentence
    • Uses cosine similarity to find the two closest vectors
    • Sorts the similarities in decreasing order and chooses the first vector
    • Gets the corresponding sentence and capitalizes it
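
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual code in bot.py: the normalize helper, the fallback message, and the sample sentences are assumptions made for the example. Lemmatization is left out to keep the sketch dependency-light, and the query is compared only against the corpus sentences, so the single best match is taken directly.

```python
import string

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize(text):
    """Lowercase, strip punctuation, and collapse whitespace.

    Hypothetical helper; the real bot also lemmatizes, which is omitted here.
    """
    return " ".join(text.lower().translate(_PUNCT_TABLE).split())


def generate_response(query, sentences, fallback="Sorry, I don't understand."):
    """Return the corpus sentence most similar to the query (sketch only)."""
    # Fit TF-IDF on the corpus sentences plus the query, as described above.
    corpus = [normalize(s) for s in sentences] + [normalize(query)]
    tfidf = TfidfVectorizer().fit_transform(corpus)

    # Cosine similarity between the query (last row) and every sentence.
    scores = cosine_similarity(tfidf[-1], tfidf[:-1]).ravel()
    best = int(scores.argmax())

    # No shared vocabulary at all: nothing sensible to return.
    if scores[best] == 0.0:
        return fallback

    # Uppercase only the first character so proper nouns keep their casing.
    sentence = sentences[best]
    return sentence[:1].upper() + sentence[1:]


if __name__ == "__main__":
    # Sample sentences made up for illustration, not taken from the repo's data.
    sample = [
        "The Indian Premier League is a Twenty20 cricket league in India.",
        "Mumbai Indians play their home matches at Wankhede Stadium.",
        "The first season was played in 2008.",
    ]
    print(generate_response("Where do Mumbai Indians play?", sample))
```

Refitting the vectorizer on every query keeps the sketch short; for a larger corpus, one might instead fit once on the extracted sentences and only transform each incoming query.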

Please ⭐ the repo and share it

Languages

  • TeX - 73.6%
  • Python - 12.3%
  • Shell - 11.4%
  • CSS - 2.7%