jc1518 / image-reader

A project to explore various foundation models that have vision capabilities in Amazon Bedrock.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Image Reader

Introduction

A project to explore various foundation models that have vision capabilities in Amazon Bedrock.

  • Image Reader uses Claude 3 multimodal models to interpret images or transcribe text in the images.
  • Image Finder uses Titan multimodal embedding model to find the similar images by text or image.
  • Image Library uses ChromaDB as the vector database for storing images embeddings.
  • Image Generator uses Titan image generator model to generate images.

Requirements

  • Use Python 3.11+, and install dependencies: pip install -r requirements.txt.

  • Default bedrock region is us-west-2, change the value of BEDROCK_REGION in constant.py accordingly if you use other region.

  • Request access to Claude 3 models and Titan models in Bedrock if you have not done that.

Use locally

Setup AWS credentials, then run cd image-reader; streamlit run Home.py

Deploy to AWS

Setup AWS credentials, then run

  • Customize the config.yaml
  • Install dependencies cd cdk; npm install
  • Deploy npx cdk deploy --require-approval never

Demo

Blog

About

A project to explore various foundation models that have vision capabilities in Amazon Bedrock.


Languages

Language:Jupyter Notebook 96.1%Language:Python 2.3%Language:TypeScript 1.4%Language:JavaScript 0.1%Language:Dockerfile 0.0%