jorge-menjivar / unsaged

Open source chat kit engineered for seamless interaction with AI models.

Home Page: https://unsaged.com

[Feature] Add GPT Vision Models

sebiweise opened this issue · comments

Any opinions on what this feature should look like?

I assume we agree that when the gpt-4-vision model is selected, we show an "upload images" icon on the left side of the message bar.

But beyond that?

The API accepts both image URLs and base64 encoded images. Should we present this choice to the user?

The API allows the image's detail to be set to low, high, or auto. Should we present this choice to the user?
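
To make the two questions above concrete, here is a minimal sketch of how a single image could be turned into the `image_url` content part that the Chat Completions API expects, whether it arrives as a URL or as base64 data, with the detail level attached. The `ImageSource` type and the `toImageContentPart` helper are hypothetical names for illustration, not anything that exists in unsaged today.

```typescript
// Detail levels the vision API accepts for an image.
type ImageDetail = "low" | "high" | "auto";

// Shape of one image part inside a user message's `content` array.
interface ImageContentPart {
  type: "image_url";
  image_url: { url: string; detail: ImageDetail };
}

// Hypothetical input type: a remote URL, or base64 data from an upload.
type ImageSource =
  | { kind: "url"; url: string }
  | { kind: "base64"; mimeType: string; data: string };

// Hypothetical helper: builds the content part for either kind of source.
// Base64 images are sent as a data URL in the same `url` field.
function toImageContentPart(
  source: ImageSource,
  detail: ImageDetail = "auto",
): ImageContentPart {
  const url =
    source.kind === "url"
      ? source.url
      : `data:${source.mimeType};base64,${source.data}`;
  return { type: "image_url", image_url: { url, detail } };
}
```

Since both forms end up in the same field, the UI could accept either one without changing how the request is built; the detail level would just be one extra setting with `auto` as the default.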

If the user uploads an image (rather than providing a URL), should we use Supabase Storage (S3-equivalent) to hold onto it? The gpt-4-vision docs recommend ...

For long running conversations, we suggest passing images via URLs instead of base64. The latency of the model can also be improved by downsizing your images ahead of time to be less than the maximum size they are expected to be. For low res mode, we expect a 512px x 512px image. For high res mode, the short side of the image should be less than 768px and the long side should be less than 2,000px.
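
The size limits quoted above can be turned into a small calculation. The sketch below assumes the limits are treated as "at most" bounds (512 x 512 for low res; short side 768 and long side 2,000 for high res), keeps the aspect ratio, and never upscales. `targetSize` is a hypothetical helper name.

```typescript
type DetailMode = "low" | "high";

interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: returns the dimensions an image should be downsized to
// before upload, based on the limits in the quoted recommendation.
function targetSize(original: Size, mode: DetailMode): Size {
  const shortSide = Math.min(original.width, original.height);
  const longSide = Math.max(original.width, original.height);

  // Assumption: the documented sizes are used as inclusive upper bounds.
  const limits =
    mode === "low" ? { short: 512, long: 512 } : { short: 768, long: 2000 };

  // Scale factor that satisfies both bounds; capped at 1 so small images
  // are left at their original size.
  const scale = Math.min(1, limits.short / shortSide, limits.long / longSide);

  return {
    width: Math.round(original.width * scale),
    height: Math.round(original.height * scale),
  };
}
```

For example, a 4000 x 3000 photo would be reduced to 1024 x 768 for high res mode and to 512 x 384 for low res mode.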

Just glancing at Supabase Storage, it looks like it could offer both URLs and resizing, which would be advantageous in long-running conversations.
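
As a rough sketch of what that could look like: supabase-js v2 exposes `storage.from(bucket).getPublicUrl(path, options)`, and its options can carry a `transform` with a width and height. The code below does not import the library; it declares a minimal interface modeled on that call (an assumption, so check it against the current Supabase docs) and the hypothetical helper `resizedImageUrl` works against anything matching it. It also assumes a public bucket and that image transformations are enabled for the project; a private bucket would need signed URLs instead.

```typescript
// Assumed subset of the transform options accepted by Supabase Storage.
interface TransformOptions {
  width?: number;
  height?: number;
  resize?: "cover" | "contain" | "fill";
}

// Minimal structural type for the one storage call used here, modeled on
// the object returned by `supabase.storage.from(bucket)` in supabase-js v2.
interface BucketApi {
  getPublicUrl(
    path: string,
    options?: { transform?: TransformOptions },
  ): { data: { publicUrl: string } };
}

// Hypothetical helper: asks storage for a URL that serves the stored image
// resized to fit inside the given box, so the model never fetches the original.
function resizedImageUrl(
  bucket: BucketApi,
  path: string,
  width: number,
  height: number,
): string {
  const { data } = bucket.getPublicUrl(path, {
    transform: { width, height, resize: "contain" },
  });
  return data.publicUrl;
}
```

With a real client this would be called as `resizedImageUrl(supabase.storage.from("images"), "chat/abc.png", 1024, 768)`, where the bucket name and path are placeholders. The resulting URL could then be sent to the API in place of base64 data.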