A command-line image captioner using the BLIP and BLIP2 models.
- Python 3.10 or higher
```shell
pip install zz-image-caption
```

You may need to install PyTorch separately, depending on your system, to use CUDA (the tool defaults to the CPU if CUDA is not available).
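For GPU acceleration, a CUDA-enabled PyTorch build can be installed from the official PyTorch wheel index. The CUDA tag below (`cu121`) is only an example; check pytorch.org for the index URL matching your system:

```shell
# Example: install PyTorch with CUDA 12.1 support (adjust the tag for your system)
pip install torch --index-url https://download.pytorch.org/whl/cu121
```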
Print the caption for an image to the console:

```shell
caption image.jpg
```

Rename images in a directory with their captions:

```shell
caption images/ -o filename
```

Write captions into the metadata of images in a directory:

```shell
caption images/ -o metadata
```

Print the caption for an image to the console using the BLIP2 model:

```shell
caption image.jpg --blip2
```
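The `-o filename` mode above renames each image after its caption. The tool's actual sanitization rules are internal, but a minimal sketch of the idea, using a hypothetical `caption_to_filename` helper and a made-up caption, might look like this:

```python
import re
from pathlib import Path

def caption_to_filename(caption: str, suffix: str, max_len: int = 64) -> str:
    """Turn a caption into a safe filename (sketch only, not the tool's actual logic)."""
    # Keep letters, digits, spaces, and hyphens; drop everything else.
    safe = re.sub(r"[^A-Za-z0-9 \-]", "", caption).strip()
    # Collapse whitespace runs into single underscores and cap the length.
    safe = re.sub(r"\s+", "_", safe)[:max_len]
    return f"{safe}{suffix}"

# Example: what "caption images/ -o filename" might do to one file.
original = Path("images/IMG_0001.jpg")
caption = "a dog playing fetch in the park"
print(caption_to_filename(caption, original.suffix))
# a_dog_playing_fetch_in_the_park.jpg
```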
The following table lists all available command-line arguments:
| Argument | Type | Choices | Default | Description |
|---|---|---|---|---|
| `-v`, `--version` | flag | | | Display the version of the tool. |
| `input` | string | | | Path to the input image file or directory. |
| `-o`, `--output` | string | `text`, `json`, `metadata`, `filename` | | Specify the output type. |
| `-a`, `--append` | string | | | Append a string to the caption output. |
| `-t`, `--token` | integer | | 32 | Max token length for captioning. |
| `-b`, `--batch` | integer | | 1 | Batch size for captioning. |
| `-p`, `--prompt` | string | | | Prompt for captioning. |
| `--temp`, `--temperature` | float | | 1.0 | Temperature for captioning. |
| `--seed` | integer | | | Seed for reproducibility. |
| `--large` | flag | | | Use the large model for captioning. |
| `--cpu` | flag | | | Use the CPU instead of the GPU (not recommended). |
| `--blip2` | flag | | | Use the BLIP2 model for captioning. |
| `--verbose` | flag | | | Print verbose output. |
| `--debug` | flag | | | Print debug output. |
Run the following to see these options in the terminal:

```shell
caption --help
```