There are 18 repositories under gpt-4-vision topic.
🤯 Lobe Chat - an open-source, modern-design LLMs/AI chat framework. Supports Multi AI Providers( OpenAI / Claude 3 / Gemini / Ollama / Bedrock / Azure / Mistral / Perplexity ), Multi-Modals (Vision/TTS) and plugin system. One-click FREE deployment of your private ChatGPT chat application.
Enhanced ChatGPT Clone: Features OpenAI, Assistants API, Azure, Groq, GPT-4 Vision, Mistral, Bing, Anthropic, OpenRouter, Vertex AI, Gemini, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. More features in development
Effective prompting for Large Multimodal Models like GPT-4 Vision, LLaVA or CogVLM. 🔥
免费的ChatGPT API的安卓语音助手,可用音量键唤起并进行语音交流,支持联网、Vision拍照识图、提问模板等功能 | A free ChatGPT API voice assistant for Android, activated via volume keys for voice interaction, supporting features such as network connectivity, Vision photo recognition, and question templates.
High quality resources & applications for LLMs, multi-modal models and VectorDBs
Desktop AI Assistant powered by GPT-4, GPT-4 Vision, GPT-3.5, DALL-E 3, Langchain, Llama-index, chat, vision, voice control, image generation and analysis, autonomous agents, code and command execution, file upload and download, speech synthesis and recognition, access to Web, memory, prompt presets, plugins, assistants & more. Linux, Windows, Mac.
AI agent that can SEE 👁️, control, navigate, & do stuff for you on your browser.
The most advanced Web UI for AI chat
[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
Convert a screenshot to a working Flutter app.
AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more
SirChatalot is a Telegram bot leveraging ChatGPT, Claude or YandexGPT. It uses Whisper for speech-to-text and DALL-E, Stability AI or YandexART for image creation. It can use vision capabilities or tools/functions.
Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦
GPT-4 Vision Chatbot examples
A multi-modal chat application enabling users to create custom agents, and integrate with local LLMs (Local Language Models), as well as OpenAI models.
Shell wrapper for OpenAI's ChatGPT, DALL-E, Whisper, and TTS. Features LocalAI, Ollama, Gemini, and Mistral integration.
GPT 4 Turbo Vision with Chainlit
This sample project integrates OpenAI's GPT-4 Vision, with advanced image recognition capabilities, and DALL·E 3, the state-of-the-art image generation model, with the Chat completions API. This powerful combination allows for simultaneous image creation and analysis.
AI Telegram Bot, ChatGPT, Dalle2, Whisper, GPT-4 Vision, Stability AI
Using Azure OpenAI deployment of GPT-4 Turbo with Vision to analyse out-of-stock situation in a fictitious retail shop.
A gradio based image captioning tool that uses the GPT-4-Vision API to generate detailed descriptions of images.
Language instructions to mycobot using GPT-4V
Object detection using Open AI Vision Model
A simple chat app with vision using Next.js, Vercel AI SDK, and GPT-4V.
A web-based tool that utilizes GPT-4's vision capabilities to analyze and describe system architecture diagrams, providing instant insights and detailed breakdowns in an interactive chat interface.
A customizable GPT in a single page, using OpenAI models text-embedding-ada-002, tts-1, whisper-1, dall-e-3, and gpt-4-vision-preview
Capture images with HoloLens and receive descriptive responses from OpenAI's GPT-4V(ision).
Use text or image prompts to generate components and apps built with React.
Chat with your images using GPT-4 Vision!
全网最低价的OpenAI ChatGPT-4-32K、ChatGPT-3.5 API 最高低于官方价42倍。The lowest-priced OpenAI ChatGPT-4-32K and ChatGPT-3.5 APIs on the entire network are 42 times lower than the official price.
Using OpenAI's GPT-4 Vision API, this tool offers an interactive way to analyze and understand your screenshots. Capture any part of your screen and engage in a dialogue with ChatGPT to uncover detailed insights, ask follow-up questions, and explore visual data in a user-friendly format.