abhirockzz / langchain-opensearch-rag

Vector databases for generative AI

Home Page: https://community.aws/content/2f5dkpj96MDM6Y9lumYPjZAB8SX

Amazon OpenSearch and LangChain demos

Before you start

  • Make sure you have Python and Streamlit installed.
  • Clone this repo.

Setup Amazon OpenSearch

Create an Amazon OpenSearch Serverless collection (choose the Vector search collection type and the Easy create option) - see the documentation for details.

Create an index with the following configuration:
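
If you prefer to create the index programmatically, here is a minimal sketch with opensearch-py. It assumes the field names and faiss engine used in the .env file later in this README, plus the 1536-dimension vectors produced by amazon.titan-embed-text-v1; the region, collection endpoint, and index name are placeholders:

from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth
import boto3

# SigV4 auth for the OpenSearch Serverless collection (service name "aoss")
credentials = boto3.Session().get_credentials()
auth = AWS4Auth(credentials.access_key, credentials.secret_key,
                "<region>", "aoss", session_token=credentials.token)

client = OpenSearch(
    hosts=[{"host": "<collection-endpoint-without-https>", "port": 443}],
    http_auth=auth, use_ssl=True, verify_certs=True,
    connection_class=RequestsHttpConnection,
)

# knn_vector field sized for Titan embeddings (1536 dimensions), faiss engine
client.indices.create(
    index="<enter name>",
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "vector_field": {
                    "type": "knn_vector",
                    "dimension": 1536,
                    "method": {"name": "hnsw", "engine": "faiss", "space_type": "l2"},
                },
                "text": {"type": "text"},
                "metadata": {"type": "object"},
            }
        },
    },
)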

Load data into OpenSearch

Download the Amazon 2022 Letter to Shareholders and place it in the same directory as load.py.

Create a .env file and provide the following info about your Amazon OpenSearch setup:

opensearch_index_name='<enter name>'
opensearch_url='<enter URL>'
engine='faiss'
vector_field='vector_field'
text_field='text'
metadata_field='metadata'

Make sure you have configured Amazon Bedrock for access from your local machine. You also need access to the amazon.titan-embed-text-v1 embedding model and the anthropic.claude-v2 model in Amazon Bedrock - follow these instructions for details.
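
To quickly confirm that both models are reachable with your local credentials, you can run a short boto3 check (a sketch - the region is a placeholder; the model IDs are the ones listed above):

import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="<region>")

# Titan embeddings: expect a 1536-dimension vector back
resp = bedrock.invoke_model(modelId="amazon.titan-embed-text-v1",
                            body=json.dumps({"inputText": "hello"}))
print(len(json.loads(resp["body"].read())["embedding"]))

# Claude v2: uses the Human/Assistant prompt format the model expects
resp = bedrock.invoke_model(modelId="anthropic.claude-v2",
                            body=json.dumps({"prompt": "\n\nHuman: Say hi\n\nAssistant:",
                                             "max_tokens_to_sample": 20}))
print(json.loads(resp["body"].read())["completion"])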

Load PDF data:

python3 -m venv myenv
source myenv/bin/activate
pip3 install -r requirements.txt

python3 load.py
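
load.py handles the ingestion: load the PDF, split it into chunks, embed each chunk with Titan, and index the vectors using the values from .env. A rough sketch of that flow (import paths vary across LangChain versions; the PDF filename, chunk sizes, and auth setup here are assumptions):

import os
import boto3
from dotenv import load_dotenv
from requests_aws4auth import AWS4Auth
from opensearchpy import RequestsHttpConnection
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import BedrockEmbeddings
from langchain_community.vectorstores import OpenSearchVectorSearch

load_dotenv()

# load and chunk the PDF (filename and chunk sizes are assumptions)
docs = PyPDFLoader("2022-Shareholder-Letter.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000,
                                        chunk_overlap=100).split_documents(docs)

# Titan embeddings via Amazon Bedrock
embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1")

# SigV4 auth for the OpenSearch Serverless collection
credentials = boto3.Session().get_credentials()
auth = AWS4Auth(credentials.access_key, credentials.secret_key,
                "<region>", "aoss", session_token=credentials.token)

# embed the chunks and index them using the field names from .env
OpenSearchVectorSearch.from_documents(
    chunks,
    embeddings,
    opensearch_url=os.environ["opensearch_url"],
    index_name=os.environ["opensearch_index_name"],
    engine=os.environ["engine"],
    vector_field=os.environ["vector_field"],
    text_field=os.environ["text_field"],
    http_auth=auth, use_ssl=True, verify_certs=True,
    connection_class=RequestsHttpConnection,
)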

Verify data in OpenSearch collection
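
One quick way to check (a sketch - same placeholders and SigV4 auth as in the index example above) is a match_all query that previews a few of the indexed chunks:

from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth
import boto3

credentials = boto3.Session().get_credentials()
auth = AWS4Auth(credentials.access_key, credentials.secret_key,
                "<region>", "aoss", session_token=credentials.token)
client = OpenSearch(hosts=[{"host": "<collection-endpoint-without-https>", "port": 443}],
                    http_auth=auth, use_ssl=True, verify_certs=True,
                    connection_class=RequestsHttpConnection)

# a few documents should come back once load.py has finished
resp = client.search(index="<enter name>",
                     body={"query": {"match_all": {}}, "size": 3})
for hit in resp["hits"]["hits"]:
    print(hit["_source"]["text"][:150])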

Run Semantic search app

streamlit run app_semantic_search.py --server.port 8080

You can ask questions, such as:

What is Amazon doing in the field of generative AI?
What were the key challenges Amazon faced in 2022?
What were some of the important investments and initiatives mentioned in the letter?
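
Under the hood, the semantic search app boils down to embedding the question with Titan and running a vector similarity search against the index (no LLM involved). A minimal sketch, with the same assumptions about import paths and auth as the load step:

import os
import boto3
from dotenv import load_dotenv
from requests_aws4auth import AWS4Auth
from opensearchpy import RequestsHttpConnection
from langchain_community.embeddings import BedrockEmbeddings
from langchain_community.vectorstores import OpenSearchVectorSearch

load_dotenv()
credentials = boto3.Session().get_credentials()
auth = AWS4Auth(credentials.access_key, credentials.secret_key,
                "<region>", "aoss", session_token=credentials.token)

vectorstore = OpenSearchVectorSearch(
    opensearch_url=os.environ["opensearch_url"],
    index_name=os.environ["opensearch_index_name"],
    embedding_function=BedrockEmbeddings(model_id="amazon.titan-embed-text-v1"),
    http_auth=auth, use_ssl=True, verify_certs=True,
    connection_class=RequestsHttpConnection,
)

# top 3 chunks most similar to the question
for doc in vectorstore.similarity_search(
        "What were the key challenges Amazon faced in 2022?", k=3,
        vector_field=os.environ["vector_field"],
        text_field=os.environ["text_field"]):
    print(doc.page_content[:200])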

Run RAG application

In a different terminal:

source myenv/bin/activate
streamlit run app_rag.py --server.port 8081

You can ask questions, such as:

What is Amazon doing in the field of generative AI?
What were the key challenges Amazon faced in 2022?
What were some of the important investments and initiatives mentioned in the letter?
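
For reference, the RAG app adds Claude v2 on Bedrock on top of the same retriever: the retrieved chunks are passed to the model as context for the answer. A minimal sketch of that wiring (app_rag.py's internals are not reproduced here; the chain type and model parameters are assumptions, and import paths vary across LangChain versions):

import os
import boto3
from dotenv import load_dotenv
from requests_aws4auth import AWS4Auth
from opensearchpy import RequestsHttpConnection
from langchain.chains import RetrievalQA
from langchain_community.llms import Bedrock
from langchain_community.embeddings import BedrockEmbeddings
from langchain_community.vectorstores import OpenSearchVectorSearch

load_dotenv()
credentials = boto3.Session().get_credentials()
auth = AWS4Auth(credentials.access_key, credentials.secret_key,
                "<region>", "aoss", session_token=credentials.token)

vectorstore = OpenSearchVectorSearch(
    opensearch_url=os.environ["opensearch_url"],
    index_name=os.environ["opensearch_index_name"],
    embedding_function=BedrockEmbeddings(model_id="amazon.titan-embed-text-v1"),
    http_auth=auth, use_ssl=True, verify_certs=True,
    connection_class=RequestsHttpConnection,
)

# Claude v2 generates the answer from the retrieved chunks
llm = Bedrock(model_id="anthropic.claude-v2",
              model_kwargs={"max_tokens_to_sample": 512})

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",   # stuffs the retrieved chunks into a single prompt
    retriever=vectorstore.as_retriever(),
)
print(qa.invoke({"query": "What were the key challenges Amazon faced in 2022?"})["result"])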