jc1518 / image-reader

A project to explore various foundation models that have vision capabilities in Amazon Bedrock.

Image Reader

Introduction

A project to explore various foundation models that have vision capabilities in Amazon Bedrock.

Image Reader uses Claude 3 multimodal models to interpret images or transcribe text in the images.
Image Finder uses Titan multimodal embedding model to find the similar images by text or image.
Image Library uses ChromaDB as the vector database for storing images embeddings.
Image Generator uses Titan image generator model to generate images.

Requirements

Use Python 3.11+, and install dependencies: pip install -r requirements.txt.
Default bedrock region is us-west-2, change the value of BEDROCK_REGION in constant.py accordingly if you use other region.
Request access to Claude 3 models and Titan models in Bedrock if you have not done that.

Use locally

Setup AWS credentials, then run cd image-reader; streamlit run Home.py

Deploy to AWS

Setup AWS credentials, then run

Customize the config.yaml
Install dependencies cd cdk; npm install
Deploy npx cdk deploy --require-approval never

Demo

Blog

About

A project to explore various foundation models that have vision capabilities in Amazon Bedrock.

Languages

Language:Jupyter Notebook 96.1%Language:Python 2.3%Language:TypeScript 1.4%Language:JavaScript 0.1%Language:Dockerfile 0.0%