YanSte / NLP-LLM-Vector-Embeddings-DB-Search

Natural Language Processing (NLP), Large Language Models (LLM), and the Power of Vector Embeddings and Databases

Home Page: https://www.kaggle.com/code/yannicksteph/nlp-llm-vector-embeddings-db-search

| NLP | LLM | Vector Embeddings | DB Search |

Learning

Overview

Embeddings, Vector Databases, and Advanced Search

Converting text into embedding vectors is the first step in any text processing pipeline. As the amount of text grows, these embedding vectors are often saved to a dedicated vector index or library so that they do not have to be recomputed and retrieval stays fast. We can then search for the documents most relevant to a query and pass them to a language model (LM) as additional context. Supplying this context is also described as giving the LM "state" or "memory". The LM then generates a response based on the additional context it receives!
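As a minimal sketch of that first step, here is how a set of documents might be converted into embedding vectors. The `sentence-transformers` package and the `all-MiniLM-L6-v2` model are illustrative assumptions, not necessarily what the notebook itself uses.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "FAISS is a library for efficient similarity search over dense vectors.",
    "ChromaDB is an open-source embedding database.",
    "Language models can answer questions when given extra context.",
]

# Encode once and reuse: persisting these vectors in a vector index or
# database is what lets us avoid recomputing them later.
embeddings = model.encode(documents)
print(embeddings.shape)  # (3, 384) for this particular model
```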

In this notebook, we will implement the full workflow of text vectorization, vector search, and question answering. While we use FAISS (a vector library), ChromaDB (a vector database), and a Hugging Face model, you can easily swap these out for your preferred tools or models!
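To make the FAISS part of the workflow concrete, the sketch below builds an exact L2 index over the embeddings from the previous snippet and retrieves the closest documents for a query. The index type and the number of results are illustrative choices rather than the notebook's exact settings.

```python
import numpy as np
import faiss

# Build an exact (brute-force) L2 index sized to the embedding
# dimensionality of the vectors produced in the previous sketch.
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(np.asarray(embeddings, dtype="float32"))

# Embed the query with the same model, then fetch the 2 nearest documents.
query = "Which tool is a vector database?"
query_vec = model.encode([query]).astype("float32")
distances, ids = index.search(query_vec, 2)

for rank, doc_id in enumerate(ids[0]):
    print(rank, distances[0][rank], documents[doc_id])
```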

Learning Objectives

  1. Implement the workflow of reading text, converting it to embeddings, and saving them to FAISS and ChromaDB.
  2. Query for similar documents using FAISS and ChromaDB.
  3. Apply a Hugging Face language model for question answering (see the sketch after this list).
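Continuing under the same assumptions as the earlier snippets, this sketch covers the ChromaDB and question-answering objectives: store the embeddings in a collection, query it, and pass the retrieved context to a Hugging Face model. The collection name and the `google/flan-t5-small` model are hypothetical placeholders.

```python
import chromadb
from transformers import pipeline

# In-memory ChromaDB client; a persistent client could be used instead.
client = chromadb.Client()
collection = client.create_collection(name="demo_docs")

# Store the documents together with the embeddings computed earlier.
collection.add(
    ids=[str(i) for i in range(len(documents))],
    documents=documents,
    embeddings=[vec.tolist() for vec in embeddings],
)

# Retrieve the most relevant documents for a question.
question = "What is ChromaDB used for?"
results = collection.query(
    query_embeddings=model.encode([question]).tolist(),
    n_results=2,
)
context = " ".join(results["documents"][0])

# Pass the retrieved context to a small text2text model as "memory".
qa = pipeline("text2text-generation", model="google/flan-t5-small")
prompt = f"Answer the question using the context.\nContext: {context}\nQuestion: {question}"
print(qa(prompt)[0]["generated_text"])
```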

About

Natural Language Processing (NLP), Large Language Models (LLM), and the Power of Vector Embeddings and Databases

https://www.kaggle.com/code/yannicksteph/nlp-llm-vector-embeddings-db-search


Languages

Language: Jupyter Notebook 100.0%