thibaudcolas / alt-text-benchmark

Alt text benchmark

A comparison of image alt text as it exists on the web in 2024, and image descriptions generated with large language models.

Install

This currently requires an OpenAI API key.

fnm use
npm install
cp .env.example .env.sh
# Add your OpenAI API key in .env.sh, then:
source .env.sh

Usage

node run.mjs

After a run, check the contents of the output file, and if it works for you, copy it to the input file ahead of the next run.

Also update the maximum run count in the script to stay within your OpenAI API rate limits.
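As a rough illustration of how a run cap can work, here is a minimal sketch. It assumes the input is a list of entries and that entries which already have a description are skipped. `MAX_RUNS`, `selectBatch`, and the `description` field are hypothetical names for this example, not necessarily what `run.mjs` uses.

```javascript
// Hypothetical cap for illustration; the real value lives in the script.
const MAX_RUNS = 3;

// Pick the entries still missing a description, up to the cap.
// Assumption: each entry is an object with an optional `description` field.
function selectBatch(entries, maxRuns = MAX_RUNS) {
  return entries.filter((entry) => !entry.description).slice(0, maxRuns);
}

const entries = [
  { src: "a.jpg", description: "Already described" },
  { src: "b.jpg" },
  { src: "c.jpg" },
  { src: "d.jpg" },
  { src: "e.jpg" },
];

// Only three of the four undescribed images are selected for this run.
console.log(selectBatch(entries).map((entry) => entry.src));
```

Under this assumption, copying the output file to the input file means the next run picks up where the previous one stopped.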

Prompt

The current prompt has been designed to be simple, with a few tweaks to make descriptions usable as alt text.

Please write a concise description of this image, 200 characters at most. Don’t start your description with generic phrases like 'a photo of' or 'a picture of', just describe. If the image contains legible text, your description has to also transcribe it. If the image is a logo or other text-only visual, only transcribe the text.
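To show how this prompt can be paired with an image, here is a minimal sketch of a Chat Completions request body. The helper names (`buildRequestBody`, `isWithinLimit`) and the `max_tokens` default are assumptions for this example; the actual script may build its request differently.

```javascript
// Prompt text copied from this README.
const PROMPT =
  "Please write a concise description of this image, 200 characters at most. " +
  "Don’t start your description with generic phrases like 'a photo of' or 'a picture of', just describe. " +
  "If the image contains legible text, your description has to also transcribe it. " +
  "If the image is a logo or other text-only visual, only transcribe the text.";

// Hypothetical helper: builds the JSON body for one image.
// The max_tokens default is an assumption, not a value from the script.
function buildRequestBody(imageUrl, maxTokens = 100) {
  return {
    model: "gpt-4-vision-preview",
    max_tokens: maxTokens,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: PROMPT },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}

// Hypothetical check: the prompt asks for 200 characters at most,
// but the model's reply is not guaranteed to respect it.
function isWithinLimit(description, limit = 200) {
  return [...description.trim()].length <= limit;
}
```

The resulting object could be sent as the JSON body of a POST request to `https://api.openai.com/v1/chat/completions`, with the API key in an `Authorization: Bearer` header.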

Rate limits

As of May 2024, here are the rate limits for gpt-4-vision-preview:

  • 10,000 tokens per minute: reached very quickly when processing multiple images at the same time, and eventually when processing them sequentially
  • 80 requests per minute: should be fine when processing images sequentially
  • 500 requests per day: can be a bottleneck
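The limits above translate into a minimum pause between sequential requests and a minimum number of days for a given batch. The sketch below works through that arithmetic. The function names are hypothetical, and the tokens-per-request figures in the usage lines are assumptions for illustration, since actual usage depends on image size and reply length.

```javascript
// Limits quoted above for gpt-4-vision-preview, as of May 2024.
const LIMITS = {
  tokensPerMinute: 10000,
  requestsPerMinute: 80,
  requestsPerDay: 500,
};

// Smallest delay (in ms) between sequential requests that respects both
// per-minute limits, given an estimated token cost per request.
function minDelayMs(tokensPerRequest, limits = LIMITS) {
  const byRequests = 60000 / limits.requestsPerMinute;
  const byTokens = (60000 * tokensPerRequest) / limits.tokensPerMinute;
  return Math.ceil(Math.max(byRequests, byTokens));
}

// Minimum number of days needed to process a batch under the daily limit.
function daysNeeded(imageCount, limits = LIMITS) {
  return Math.ceil(imageCount / limits.requestsPerDay);
}

console.log(minDelayMs(1000)); // 6000: the token limit dominates
console.log(minDelayMs(100)); // 750: the request limit dominates
console.log(daysNeeded(1200)); // 3
```

With these numbers, the per-minute token budget is usually the tighter constraint for larger images, while the daily request cap decides how long a large benchmark takes overall.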
