This workshop
- is an introduction to the Python ecosystem of embeddings and vector databases.
- demonstrates how to build a search app.
- is designed to be delivered in person.
- is not a deep dive into all the technologies involved.
Visit https://bit.ly/techspace-llm-workshop for the workshop material as a Google Colab notebook.
I recommend keeping the workshop contained in a conda environment, if you can.
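For example, a dedicated environment could be set up like this (the environment name `llm-workshop` and the Python version are illustrative choices, not requirements of the workshop):

```shell
# Create and activate an isolated conda environment for the workshop.
# Name and Python version here are my own picks; adjust as you like.
conda create -n llm-workshop python=3.11 -y
conda activate llm-workshop
```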
A machine or an environment (Google Colab, Kaggle, etc.) that supports:
- Python
- LangChain
- Chroma, an open-source, lightweight embedding database
- Pandas, for data transformations
- SQLite, plus a SQLite browser to view the records
- FastAPI and uvicorn
- An OpenAI API key (sign up with OpenAI and create one)
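One way to install the dependencies above and expose your OpenAI key is sketched below (package names are the usual PyPI names for the listed tools; the key value is a placeholder you replace with your own):

```shell
# Install the workshop dependencies listed in the prerequisites.
pip install langchain chromadb pandas fastapi uvicorn openai

# Make the OpenAI key available to Python code (placeholder value).
export OPENAI_API_KEY="sk-..."
```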
- What are embeddings?
- What are vector databases?
- What is Retrieval Augmented Generation (RAG)?
- Building a search engine
- Select a textbook (preselected for the workshop)
- Create chunks from the book pages using LangChain text splitter utilities
- Embed chunks in Chroma
- Build a query service
- Use RAG to summarize an answer to the user's question
- Host with FastAPI, if time permits
- Troubleshooting
- Q&A, Discussion
- Appendix
- Tooling
- Python Ecosystem
Skeleton utilities for all of these steps will be provided during the workshop.
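As a rough illustration of the chunking step, here is a toy pure-Python splitter: fixed-size chunks with overlap, which is the core idea behind LangChain's text splitter utilities (the real `RecursiveCharacterTextSplitter` additionally prefers paragraph and sentence boundaries; the sizes below are arbitrary):

```python
def split_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into overlapping fixed-size chunks.

    A toy stand-in for LangChain's text splitters, just to show
    why neighbouring chunks share some text (overlap preserves
    context that would otherwise be cut at a chunk boundary).
    """
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

page = ("word " * 60).strip()  # a fake ~300-character book page
chunks = split_text(page, chunk_size=100, overlap=20)
```

Each chunk ends with the 20 characters that the next chunk starts with, so no sentence is lost entirely at a boundary.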
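Under the hood, an embedding database like Chroma answers a query by nearest-neighbour search over vectors. A minimal sketch of that idea with hand-made 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions, and Chroma handles the indexing for you):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend these are embeddings of three book chunks (made-up values).
index = {
    "chunk-cats": [0.9, 0.1, 0.0],
    "chunk-dogs": [0.8, 0.2, 0.1],
    "chunk-math": [0.0, 0.1, 0.9],
}

# Pretend embedding of the user's question, close to the "cats" chunk.
query = [0.85, 0.15, 0.05]
best = max(index, key=lambda cid: cosine_similarity(query, index[cid]))
```

A real query service would embed the question with the same model used for the chunks, then ask Chroma for the top-k most similar chunk IDs.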
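The RAG step is, at its core, prompt assembly: the retrieved chunks are pasted into the prompt ahead of the user's question, and the LLM is asked to answer from that context only. A minimal sketch (the prompt wording is my own; in the workshop this string would be sent to an OpenAI chat model):

```python
def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Assemble a RAG prompt: retrieved context first, then the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "What do cats eat?",
    ["Cats are obligate carnivores.", "Kittens need frequent meals."],
)
# `prompt` would now be sent to the chat model as the user message.
```

Grounding the model in retrieved text this way is what lets it answer questions about the textbook without having been trained on it.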
TBD
Email me with any questions: bhanu@collab.place