This repository contains the final project for the LLM Zoomcamp course. It demonstrates the application of Retrieval-Augmented Generation (RAG) techniques, using StackExchange data to answer questions about foundational neural network concepts.
The chatbot uses RAG to answer questions about neural networks: a retriever finds relevant answers in a curated dataset, and a generator turns them into clear, concise responses, making it a useful tool for anyone seeking foundational knowledge of neural networks.
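The end-to-end flow looks roughly like the sketch below: retrieve candidate answers from the knowledge base, then prompt an LLM with them as context. This is a minimal illustration rather than the app's actual code; the index name `nn-questions`, the field names, and the choice of `gemma:2b` served by Ollama are assumptions.

```python
# Minimal retrieve-then-generate sketch. Index and field names are illustrative.
import requests
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def retrieve(question, k=5):
    """Fetch the k most relevant answers from the knowledge base."""
    resp = es.search(index="nn-questions",
                     query={"match": {"answer": question}},
                     size=k)
    return [hit["_source"]["answer"] for hit in resp["hits"]["hits"]]

def generate(question, context_docs):
    """Ask a local Ollama model to answer using the retrieved context."""
    context = "\n\n".join(context_docs)
    prompt = (f"Answer the question using only the context below.\n\n"
              f"Context:\n{context}\n\nQuestion: {question}")
    r = requests.post("http://localhost:11434/api/generate",
                      json={"model": "gemma:2b", "prompt": prompt, "stream": False})
    return r.json()["response"]

question = "What is backpropagation?"
print(generate(question, retrieve(question)))
```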
The data for this project was gathered with the StackExchange API, focusing on fundamental questions about neural networks. The collected answers vary in length, so the Gemini 1.0 Pro model was used to summarize them. You can find the data here.
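For reference, collection via the StackExchange API can look roughly like the sketch below; the `site` parameter and search query are assumptions, not necessarily the ones used to build this dataset.

```python
# Rough sketch of data collection against the StackExchange API.
import requests

API = "https://api.stackexchange.com/2.3"

def fetch_questions(query="neural network", site="ai", page=1):
    """Search questions matching the query; filter=withbody includes post bodies."""
    params = {"order": "desc", "sort": "relevance", "q": query,
              "site": site, "page": page, "filter": "withbody"}
    return requests.get(f"{API}/search/advanced", params=params).json()["items"]

def fetch_answers(question_ids, site="ai"):
    """Fetch answers for a batch of question ids (semicolon-separated)."""
    ids = ";".join(map(str, question_ids))
    params = {"order": "desc", "sort": "votes", "site": site, "filter": "withbody"}
    return requests.get(f"{API}/questions/{ids}/answers", params=params).json()["items"]
```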
Text search retrieval evaluation:

| Type | Hit Rate | MRR |
|---|---|---|
| text_elasticsearch | 0.5828 | 0.4396 |
| text_customsearch | 0.5572 | 0.4397 |

Vector search retrieval evaluation:

| Type | Hit Rate | MRR |
|---|---|---|
| question_vector_elasticsearch | 0.6256 | 0.5150 |
| answer_vector_elasticsearch | 0.8308 | 0.7089 |
| question-answer_vector_elasticsearch | 0.8548 | 0.7324 |
| custom-combined_vector_scoring_elasticsearch | 0.8320 | 0.7055 |
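For context, Hit Rate and MRR are typically computed as in the generic sketch below (this is not the project's evaluation script):

```python
# Hit Rate: fraction of queries whose ranked results contain the correct document.
# MRR: mean of 1/rank of the first correct document (0 if it never appears).
def hit_rate(relevance):
    """relevance: one list of booleans per query, ordered by rank."""
    return sum(any(ranks) for ranks in relevance) / len(relevance)

def mrr(relevance):
    total = 0.0
    for ranks in relevance:
        for i, hit in enumerate(ranks):
            if hit:
                total += 1.0 / (i + 1)
                break
    return total / len(relevance)

# Three queries with top-3 results each: hits at rank 2, rank 1, and never.
example = [[False, True, False], [True, False, False], [False, False, False]]
print(hit_rate(example))  # 0.666...
print(mrr(example))       # (1/2 + 1/1 + 0) / 3 = 0.5
```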
Cosine similarity between original answers and LLM-generated answers (mistral-7b-instruct-v0.1):

| Count | Mean | Std Dev | Min | 25% | 50% | 75% | Max |
|---|---|---|---|---|---|---|---|
| 2500 | 0.7092 | 0.1579 | -0.0682 | 0.6213 | 0.7420 | 0.8259 | 0.9870 |
Cosine similarity between original answers and LLM-generated answers (llama-2-7b-chat-int8):

| Count | Mean | Std Dev | Min | 25% | 50% | 75% | Max |
|---|---|---|---|---|---|---|---|
| 2500 | 0.6756 | 0.1611 | -0.0208 | 0.5821 | 0.7050 | 0.7927 | 0.9819 |
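These numbers are cosine similarities between embeddings of each original answer and the corresponding generated one. A generic sketch, assuming a sentence-transformers model (`all-MiniLM-L6-v2` is an assumption; the project may use a different embedding model):

```python
# Cosine similarity between embeddings of an original and a generated answer.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

original = model.encode("Backpropagation computes gradients layer by layer.")
generated = model.encode("Backprop propagates the error backwards to get gradients.")
print(cosine(original, generated))
```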
Relevance of LLM-generated answers to the original answers (judged by llama-2-7b-chat-int8):

| Category | Count |
|---|---|
| Relevant | 85 |
| Partly Relevant | 54 |
| Non-Relevant | 11 |
Relevance of LLM-generated answers to the questions (judged by llama-2-7b-chat-int8):

| Category | Count |
|---|---|
| Relevant | 74 |
| Partly Relevant | 70 |
| Non-Relevant | 6 |
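The relevance labels come from an LLM acting as a judge. A rough sketch of that pattern, pointed at the local Ollama endpoint for illustration (the actual judging above used llama-2-7b-chat-int8, and the prompt wording here is an assumption):

```python
# LLM-as-a-judge sketch: classify a generated answer's relevance to a question.
import json
import requests

PROMPT = """You are an expert evaluator. Classify how relevant the generated
answer is to the question as one of: RELEVANT, PARTLY_RELEVANT, NON_RELEVANT.

Question: {question}
Generated answer: {answer}

Reply with JSON: {{"relevance": "..."}}"""

def judge(question, answer, model="gemma:2b"):
    r = requests.post("http://localhost:11434/api/generate",
                      json={"model": model,
                            "prompt": PROMPT.format(question=question, answer=answer),
                            "stream": False})
    # Small models may not always return valid JSON; a real script should
    # guard this parse and retry on failure.
    return json.loads(r.json()["response"])["relevance"]
```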
- Data: StackExchange API
- LLMs: Gemini, Mistral, Llama, Ollama, Cloudflare Workers AI
- Knowledge base: TF-IDF search, Elasticsearch, LanceDB
- Interface: Streamlit
The following steps start the required services (Elasticsearch and Ollama) and launch the app.
- Run Elasticsearch:

```bash
docker run -it \
    --rm \
    --name elasticsearch \
    -p 9200:9200 \
    -p 9300:9300 \
    -e "discovery.type=single-node" \
    -e "xpack.security.enabled=false" \
    docker.elastic.co/elasticsearch/elasticsearch:8.4.3
```
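Once the container is up, a quick Python sanity check can confirm the cluster is reachable (the index name and mapping below are illustrative, not the project's actual schema):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
print(es.info()["version"]["number"])  # expect 8.4.3

# Create a simple text index if it doesn't exist yet (illustrative schema).
if not es.indices.exists(index="nn-questions"):
    es.indices.create(index="nn-questions", mappings={
        "properties": {
            "question": {"type": "text"},
            "answer": {"type": "text"},
        }
    })
```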
- Run Ollama:

```bash
docker run -it \
    --rm \
    -v ollama:/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama
```
- Run Gemma 2B:

```bash
docker exec -it ollama ollama run gemma:2b
```
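A quick smoke test that the model was pulled and the API is reachable on the mapped port:

```python
import requests

# gemma:2b should appear in the list of locally available models.
tags = requests.get("http://localhost:11434/api/tags").json()
print([m["name"] for m in tags["models"]])
```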
- Install requirements:

```bash
pip install -r requirements.txt
```
- Run the Streamlit app:

```bash
streamlit run streamlit_app/app.py
```