Llava-WEBUI-Caption: Enhance Your Image Captioning Experience

Llava-WEBUI-Caption is a user-friendly Gradio-based interface for captioning image datasets with the LLaVA model. It works with PNG, JPG, JPEG, and WEBP images and simplifies generating and managing their captions.

🌟 Features

Image Directory Captioning: Automatically create text captions for images in various formats, saving them alongside the original files.
Download & Caption: Just add image URLs to an "images.json" file in your source folder and let the script handle the downloading and captioning.
Caption Management: Choose to skip, replace, or append to existing captions, giving you full control over your captioning process.
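
The file handling behind these features can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names, the `.txt` caption extension, the ", " separator used when appending, and the layout of "images.json" (taken here to be a flat JSON list of URL strings) are all assumptions.

```python
import json
from pathlib import Path

# Formats listed in this README; matching is case-insensitive.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}
CAPTION_MODES = ("skip", "replace", "append")


def find_images(folder):
    """Return the supported image files in `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def read_image_urls(folder):
    """Read URLs from images.json (assumed: a flat JSON list of strings)."""
    manifest = Path(folder) / "images.json"
    if not manifest.exists():
        return []
    return json.loads(manifest.read_text(encoding="utf-8"))


def write_caption(image_path, caption, mode="skip"):
    """Save `caption` next to the image; return True if the file was written."""
    if mode not in CAPTION_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumption: captions live in a .txt file with the image's base name.
    caption_path = Path(image_path).with_suffix(".txt")
    if caption_path.exists():
        if mode == "skip":
            return False  # keep the existing caption untouched
        if mode == "append":
            existing = caption_path.read_text(encoding="utf-8").rstrip()
            if existing:
                caption = f"{existing}, {caption}"
    caption_path.write_text(caption, encoding="utf-8")
    return True
```

In this sketch, a captioning loop would call `find_images(folder)`, pass each image to the model, and hand the result to `write_caption` with the mode chosen in the UI. Downloading the URLs returned by `read_image_urls` is left out on purpose.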

⚙️ Installation

Get the 1.1.3 release from the LLaVA repository.
Follow the setup instructions provided in the LLaVA repo.
Add the Llava-WEBUI-Caption script to the downloaded folder and run it to start captioning your images.

About

A simple web UI based on Gradio for using the LLaVA model to caption datasets.
