Multimodal Retrieval Augmented Generation with LLMs

This project uses OpenAI's large language models (LLMs), such as GPT-3.5 and GPT-4, for multimodal retrieval-augmented generation. It analyzes, summarizes, and indexes three data types: text, tables, and images. The summaries are embedded with OpenAIEmbeddings and stored in a Chroma vectorstore and an InMemoryStore. The system is designed to synthesize information across formats and to prepare for future integration with multimodal models such as GPT-4V and CLIP.
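The summarize-then-index idea described above can be illustrated with a small, dependency-free sketch. It is not this project's code: `toy_embed` is a hypothetical bag-of-words stand-in for OpenAIEmbeddings, the list of summary vectors stands in for the Chroma vectorstore, and the plain dictionary stands in for the InMemoryStore. The assumption that the raw elements are kept in the docstore and linked to their summaries by a shared ID is ours, and the class and function names are illustrative only.

```python
import math
import uuid
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for OpenAIEmbeddings: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MultiVectorIndex:
    """Search over summaries, return the original elements they describe."""

    def __init__(self):
        self.summary_vectors = []  # (doc_id, vector); stand-in for Chroma
        self.docstore = {}         # doc_id -> raw element; stand-in for InMemoryStore

    def add(self, raw, summary, kind):
        # One shared ID links the embedded summary to its raw element.
        doc_id = str(uuid.uuid4())
        self.summary_vectors.append((doc_id, toy_embed(summary)))
        self.docstore[doc_id] = {"kind": kind, "content": raw}
        return doc_id

    def retrieve(self, query, k=1):
        # Rank summaries by similarity to the query, then look up raw elements.
        q = toy_embed(query)
        ranked = sorted(
            self.summary_vectors,
            key=lambda pair: cosine(q, pair[1]),
            reverse=True,
        )
        return [self.docstore[doc_id] for doc_id, _ in ranked[:k]]


index = MultiVectorIndex()
index.add("Q1 revenue | 10\nQ2 revenue | 14",
          "table of quarterly revenue figures", "table")
index.add("<base64 image bytes>",
          "photo of a red bicycle leaning on a wall", "image")
index.add("Long passage about model training",
          "text describing how the model was trained", "text")

top = index.retrieve("quarterly revenue")[0]
print(top["kind"])
```

In a real pipeline the summaries would presumably be written by an LLM and embedded with a learned model rather than word counts; the point of the sketch is only the split between a searchable summary index and a store that holds the original text, table, or image.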

morrissas / Multimodal-RAG-With-OpenAI
