A command-line image captioner using the BLIP and BLIP2 models.
- Python 3.10 or higher
```shell
pip install zz-image-caption
```

You may need to install PyTorch separately, depending on your system, to use CUDA (the tool defaults to the CPU if CUDA is not available).
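For GPU acceleration, a CUDA-enabled PyTorch build can be installed from the official PyTorch wheel index. The CUDA tag below (`cu121`) is only an example; check pytorch.org for the index URL matching your system:

```shell
# Example: install PyTorch with CUDA 12.1 support (adjust the tag for your system)
pip install torch --index-url https://download.pytorch.org/whl/cu121
```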
Print the caption for an image to the console:

```shell
caption image.jpg
```

Rename images in a directory with their captions:

```shell
caption images/ -o filename
```

Write captions into the metadata of images in a directory:

```shell
caption images/ -o metadata
```

Print the caption for an image to the console using the BLIP2 model:

```shell
caption image.jpg --blip2
```
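The `-o filename` mode above renames each image after its caption. The tool's actual sanitization rules are internal, but a minimal sketch of the idea, using a hypothetical `caption_to_filename` helper and a made-up caption, might look like this:

```python
import re
from pathlib import Path

def caption_to_filename(caption: str, suffix: str, max_len: int = 64) -> str:
    """Turn a caption into a safe filename (sketch only, not the tool's actual logic)."""
    # Keep letters, digits, spaces, and hyphens; drop everything else.
    safe = re.sub(r"[^A-Za-z0-9 \-]", "", caption).strip()
    # Collapse whitespace runs into single underscores and cap the length.
    safe = re.sub(r"\s+", "_", safe)[:max_len]
    return f"{safe}{suffix}"

# Example: what "caption images/ -o filename" might do to one file.
original = Path("images/IMG_0001.jpg")
caption = "a dog playing fetch in the park"
print(caption_to_filename(caption, original.suffix))
# a_dog_playing_fetch_in_the_park.jpg
```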
The following table lists all available command-line arguments:
| Argument | Type | Choices | Default | Description |
|---|---|---|---|---|
| `-v`, `--version` | flag | | | Display the version of the tool. |
| `input` | string | | | Path to the input image file or directory. |
| `-o`, `--output` | string | `text`, `json`, `metadata`, `filename` | | Specify the output type. |
| `-a`, `--append` | string | | | Append a string to the caption output. |
| `-t`, `--token` | integer | | 32 | Max token length for captioning. |
| `-b`, `--batch` | integer | | 1 | Batch size for captioning. |
| `-p`, `--prompt` | string | | | Prompt for captioning. |
| `--temp`, `--temperature` | float | | 1.0 | Temperature for captioning. |
| `--seed` | integer | | | Seed for reproducibility. |
| `--large` | flag | | | Use the large model for captioning. |
| `--cpu` | flag | | | Use the CPU instead of the GPU (not recommended). |
| `--blip2` | flag | | | Use the BLIP2 model for captioning. |
| `--verbose` | flag | | | Print verbose output. |
| `--debug` | flag | | | Print debug output. |
Run the following to see these options in the terminal:

```shell
caption --help
```