Jackiebibili / algo-chatbox-nlp

Introduction to Algorithms by CLRS (Prof. Chida's UT Dallas Advanced Algorithms class) chatbot that can answer student questions related to the class syllabus, algorithms in general, and Big-O function comparison


Intro to Algorithms Class Chatbot

Prerequisites for Development

  • Python 3.9 and pip
    • tensorflow
    • tensorflow_hub
    • tensorflow-text
    • (See details in "How to Run the Project" about installing other dependencies)
  • Haystack (see installation instructions)
  • Elasticsearch host (see details in "Miscellaneous" about elasticsearch)

Prerequisites for Deployment

  • Docker
  • Elasticsearch host (see details in "Miscellaneous" about elasticsearch)

How to Run the Project for Development

  • Change directory to src/server
  • Run the shell script run.sh to prepare the following data and store it in your Elasticsearch host:
    • i) question and answer (Q&A) pairs,
    • ii) the general algorithm knowledge base, and
    • iii) the syllabus knowledge base.
  • Install pipenv with pip install pipenv
  • Run pipenv install to install the backend dependencies listed in Pipfile.lock
  • Run python -m src.api.app to start the Flask API, which then initializes the NLP pipeline

Project Deployment

  • Change directory to src/server
  • Run the shell script prod.sh to build a Docker image of the API and the NLP pipeline. The backend then runs in a Docker container. The image has been built successfully on a Linux x86 system.
  • A GPU on the backend server is recommended for faster inference.

Architecture

[Figure: NLP pipeline visualization]

Performance of QueryClassifier

Precision = 1.00
Recall = 0.83
F1 = 0.91
[Figure: confusion matrix of the QueryClassifier]
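
As a quick consistency check on the reported numbers, F1 is the harmonic mean of precision and recall. The sketch below is not taken from the project's evaluation code; the function name f1_score is a hypothetical helper, and the inputs are just the rounded values listed above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values reported for the QueryClassifier in this README.
reported_precision = 1.00
reported_recall = 0.83

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}")  # F1 = 0.91
```

A precision of 1.00 means the classifier produced no false positives on the evaluation set, so the F1 score is limited only by the missed positives reflected in the recall.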

Miscellaneous

  • In Fall 2022, the backend was hosted on the Google Cloud Platform. We used an NVIDIA Tesla P100 GPU (225 W, 16 GB memory) and 2 vCPUs with 7.5 GB RAM. This setup gave an average response time of about 5 s per request.
  • An Elasticsearch host must be running, either locally or in the cloud. We previously used a free-trial cloud Elasticsearch host provided by elastic.co. Alternatively, an Elasticsearch instance can be hosted in a Docker container.
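
Before running run.sh or starting the API, it can help to confirm that the Elasticsearch host is reachable. The following is a minimal sketch using only the Python standard library, not part of this project. The function name elasticsearch_info and the default URL http://localhost:9200 are assumptions for illustration; substitute the address of your own host. It also assumes the host accepts unauthenticated HTTP requests, so a secured cluster will be reported as unreachable.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def elasticsearch_info(
    base_url: str = "http://localhost:9200",  # assumed default; change to your host
    timeout: float = 3.0,
) -> Optional[dict]:
    """Return the JSON document served at the Elasticsearch root URL, or None on failure."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error (e.g. 401), or a non-JSON response.
        return None


if __name__ == "__main__":
    info = elasticsearch_info()
    if info is None:
        print("Elasticsearch host is not reachable")
    else:
        print("Elasticsearch version:", info.get("version", {}).get("number"))
```

Returning None instead of raising keeps the check easy to call from a setup script, at the cost of hiding the exact failure reason.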
