Charmve / gpt-eyes

I GAVE GPT-4 EYES!

agent gpt-4 gpt-4o gpt-4omni gpt4 gpt4v llm llm-inference world-model world-models worldmodel

GPT Eyes

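I gave GPT-4 eyes. "Eyes watching in six directions, ears listening on eight sides" (a Chinese idiom for staying alert to everything around you).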

Here’s what I did:

added some data to a vision model
gave the AI camera access
asked it questions about the scene
it identified the objects in view
it searched the web for more information
it used that information to answer accurately

Watch it get 3 questions 100% correct!

This Package Is Sponsorware 💰💰💰

https://github.com/sponsors/Charmve?frequency=one-time&sponsor=Charmve

This repo was only available to my sponsors on GitHub Sponsors until I reached 15 sponsors.

Learn more about Sponsorware at github.com/sponsorware/docs 💰.

Technologies Used

Frontend: React
Image Analysis API: TensorFlow Models - MobileNet
Text Generation API: GPT API

Installation

Clone the repository: git clone https://github.com/Charmve/gpt-eyes.git
Navigate to the project directory: cd gpt-eyes
Install dependencies: npm install

Configuration

Create an account and obtain API keys for TensorFlow Models - MobileNet and GPT API.
Update the configuration file with your API keys:
- TensorFlow Models - MobileNet: /path/to/config.js
- GPT API: /path/to/config.js
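As a rough illustration of the configuration step, here is a minimal sketch of what such a file and a startup check might look like. The key names (`mobileNetApiKey`, `gptApiKey`), the placeholder values, and the `findUnsetKeys` helper are all hypothetical and not taken from the repo; adapt them to whatever the project's actual config file expects.

```javascript
// Hypothetical config shape; the key names are illustrative, not taken from the repo.
const config = {
  mobileNetApiKey: "YOUR_MOBILENET_KEY",
  gptApiKey: "YOUR_GPT_API_KEY",
};

// Return the names of keys that are empty or still set to a placeholder.
function findUnsetKeys(cfg) {
  return Object.entries(cfg)
    .filter(([, value]) => !value || value.startsWith("YOUR_"))
    .map(([name]) => name);
}

// Warn early instead of failing later on the first API call.
const unset = findUnsetKeys(config);
if (unset.length > 0) {
  console.warn(`Missing API keys: ${unset.join(", ")}`);
}
```

Checking for unset keys at startup makes a forgotten key show up as a clear warning rather than as a failed request later on.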

Usage

Start the development server: npm start
Open your browser and visit: http://localhost:3000

How it Works

The device camera captures an image.
The application uses the TensorFlow Models - MobileNet API to analyze the image and extract object information.
The application sends the extracted object information to the GPT API.
The GPT API generates text describing the detected objects.
The application displays the analyzed image and the generated text.
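The steps above can be sketched as a small piece of glue code that turns image classification results into a text generation request. This is a minimal sketch, not the repo's actual implementation: the helper names (`summarizePredictions`, `buildChatRequest`), the confidence threshold, the prompt wording, and the `gpt-4o` model name are assumptions. The `{ className, probability }` shape matches what the MobileNet model in TensorFlow.js returns from `classify()`.

```javascript
// Hypothetical helpers; only the { className, probability } shape comes from
// the MobileNet classify() result.

// Keep only predictions above a confidence threshold, strongest first.
function summarizePredictions(predictions, minProbability = 0.1) {
  return predictions
    .filter((p) => p.probability >= minProbability)
    .sort((a, b) => b.probability - a.probability)
    .map((p) => `${p.className} (${Math.round(p.probability * 100)}%)`);
}

// Build a chat-style request body from the detected objects and the user's question.
// The model name is an assumption; use whichever model your account supports.
function buildChatRequest(predictions, question, model = "gpt-4o") {
  const objects = summarizePredictions(predictions);
  const scene = objects.length > 0 ? objects.join(", ") : "no confident detections";
  return {
    model,
    messages: [
      {
        role: "system",
        content: "You answer questions about a camera scene using the detected objects.",
      },
      { role: "user", content: `Detected objects: ${scene}\nQuestion: ${question}` },
    ],
  };
}

// Sample predictions standing in for a real classify() result.
const samplePredictions = [
  { className: "coffee mug", probability: 0.82 },
  { className: "cup", probability: 0.11 },
  { className: "teapot", probability: 0.02 },
];

const request = buildChatRequest(samplePredictions, "What is on the desk?");
console.log(JSON.stringify(request, null, 2));
```

In the browser, the predictions would typically come from loading the model with `mobilenet.load()` and calling `model.classify()` on a video or image element, and the request body would then be sent to the text generation API with the key from your configuration. Filtering out low-confidence labels first keeps the prompt short and avoids asking the language model about objects that are probably not in the scene.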

Languages

JavaScript 78.4%, CSS 12.5%, HTML 9.1%