awslabs / rhubarb

A Python framework for multi-modal document understanding with Amazon Bedrock

Rhubarb

Rhubarb is a lightweight Python framework that makes it easy to build document understanding applications using multi-modal large language models (LLMs) and embedding models. Rhubarb was built from the ground up to work with Amazon Bedrock, the Anthropic Claude V3 multi-modal language models, and the Amazon Titan Multi-modal Embedding model.

What can I do with Rhubarb?

Visit the Rhubarb documentation.

Rhubarb can perform multiple document processing tasks, such as:

  • ✅ Document Q&A
  • ✅ Streaming chat with documents (Q&A)
  • ✅ Document Summarization
    • 🚀 Page level summaries
    • 🚀 Full summaries
    • 🚀 Summaries of specific pages
    • 🚀 Streaming Summaries
  • ✅ Structured data extraction
  • ✅ Named entity recognition (NER)
    • 🚀 With 50 built-in common entities
  • ✅ PII recognition with built-in entities
  • ✅ Figure and image understanding from documents
    • 🚀 Explain charts, graphs, and figures
    • 🚀 Perform table reasoning (as figures)
  • ✅ Document Classification with vector sampling using multi-modal embedding models
  • ✅ Logs token usage to help keep track of costs

Rhubarb comes with built-in system prompts that make it easy to use for a number of different document understanding use cases. You can customize Rhubarb by passing in your own system prompts. It supports exact JSON schema based output generation, which makes it easy to integrate into downstream applications.
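To illustrate why schema-based output helps downstream code, here is a minimal sketch that uses only the standard library. The schema, its field names, and the sample response are hypothetical and chosen for illustration. This README does not show how a schema is passed to Rhubarb, so that call is left out. The checker only handles flat objects and is not a full JSON Schema validator.

```python
import json

# Hypothetical schema for an employee record; the field names are illustrative.
EMPLOYEE_SCHEMA = {
    "type": "object",
    "properties": {
        "employee_name": {"type": "string"},
        "employee_id": {"type": "string"},
        "start_year": {"type": "integer"},
    },
    "required": ["employee_name", "employee_id"],
}

# Maps the JSON Schema type names used above to Python types.
_PYTHON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def check_flat_object(payload, schema):
    """Return a list of problems; an empty list means the payload matches.

    Only the 'required' keyword and simple 'type' keywords on top-level
    properties are checked.
    """
    problems = []
    for key in schema.get("required", []):
        if key not in payload:
            problems.append(f"missing required field: {key}")
    for key, spec in schema.get("properties", {}).items():
        if key not in payload:
            continue
        value = payload[key]
        expected = _PYTHON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        is_misplaced_bool = isinstance(value, bool) and spec["type"] != "boolean"
        if is_misplaced_bool or not isinstance(value, expected):
            problems.append(f"wrong type for {key}: expected {spec['type']}")
    return problems


# Hypothetical model output; a real response depends on the document.
raw_response = '{"employee_name": "Jane Doe", "employee_id": "E-1001", "start_year": 2021}'
record = json.loads(raw_response)
problems = check_flat_object(record, EMPLOYEE_SCHEMA)
print(problems or "record matches the schema")
```

Because the output shape is known in advance, a downstream application can parse it with `json.loads` and read fields directly instead of scraping free-form text.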

  • Supports PDF, TIFF, PNG, and JPG files (support for Word, Excel, PowerPoint, CSV, WebP, and EML files coming soon)
  • Performs document to image conversion internally to work with the multi-modal models
  • Works on local files or files stored in S3
  • Supports specifying page numbers for multi-page documents
  • Supports chat-history based chat for documents
  • Supports streaming and non-streaming mode
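The local-versus-S3 distinction in the list above comes down to the shape of the path string. The sketch below shows how an `s3://` URI breaks into a bucket and an object key using the standard library. `split_document_location` is a hypothetical helper written for illustration; it is not Rhubarb's internal logic.

```python
from urllib.parse import urlparse


def split_document_location(file_path):
    """Classify a path as S3 or local (hypothetical helper for illustration).

    Returns ("s3", bucket, key) for s3:// URIs and ("local", None, path)
    for everything else.
    """
    parsed = urlparse(file_path)
    if parsed.scheme == "s3":
        # In s3://bucket/key, the bucket is the network location and the
        # key is the path without its leading slash.
        return ("s3", parsed.netloc, parsed.path.lstrip("/"))
    return ("local", None, file_path)


print(split_document_location("s3://my-bucket/reports/doc.pdf"))
print(split_document_location("./path/to/doc/doc.pdf"))
```

Note that the first component after `s3://` is always the bucket name, so in a path such as `s3://path/to/doc/doc.pdf` the bucket is `path` and the key is `to/doc/doc.pdf`.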

Installation

Start by installing Rhubarb using pip.

pip install pyrhubarb

Usage

Create a boto3 session.

import boto3
session = boto3.Session()

Call Rhubarb

Local file

from rhubarb import DocAnalysis

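# Point DocAnalysis at a local document and reuse the boto3 session created above.
da = DocAnalysis(file_path="./path/to/doc/doc.pdf",
                 boto3_session=session)
# Ask a question about the document in natural language.
resp = da.run(message="What is the employee's name?")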
resp

With file in Amazon S3

from rhubarb import DocAnalysis

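# The call is the same for S3; only the path changes to an s3:// URI.
# The credentials behind the session need read access to the object.
da = DocAnalysis(file_path="s3://path/to/doc/doc.pdf",
                 boto3_session=session)
resp = da.run(message="What is the employee's name?")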
resp
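When several questions target the same document, the calls above can be wrapped in a small helper. The sketch below uses only the `DocAnalysis(file_path=..., boto3_session=...)` constructor and the `run(message=...)` method shown in this README. `ask_questions` and its `analysis_factory` parameter are hypothetical names introduced here, and the sketch assumes that a single `DocAnalysis` instance can be reused for more than one `run` call.

```python
def ask_questions(file_path, questions, session, analysis_factory=None):
    """Ask each question about one document and return {question: response}.

    analysis_factory is a hypothetical hook that lets a stand-in class be
    supplied (for example in tests); by default Rhubarb's DocAnalysis is used.
    """
    if analysis_factory is None:
        # Imported lazily so this helper can be defined without Rhubarb installed.
        from rhubarb import DocAnalysis
        analysis_factory = DocAnalysis

    # Assumption: one instance can serve several run() calls.
    da = analysis_factory(file_path=file_path, boto3_session=session)
    return {question: da.run(message=question) for question in questions}


# Example call (requires boto3, Rhubarb, and AWS credentials):
#   import boto3
#   answers = ask_questions(
#       "./path/to/doc/doc.pdf",
#       ["What is the employee's name?", "What is the employee's address?"],
#       session=boto3.Session(),
#   )
```

Passing the class in as a parameter keeps the helper easy to exercise with a fake object, without calling Amazon Bedrock.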

For more usage examples, see the cookbooks.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
