AgilistTim / gpt-video

A reproduction of the Gemini demo using GPT-vision.

Home Page: https://gpt-video.vercel.app

GPT Video - Reproducing the Gemini demo using GPT-4 Vision

Screenshot of the App

🌌 Overview

After seeing the Gemini demo video, I asked myself: could the experience showcased by Google be more than just a scripted demo? This project is a fun experiment to explore whether real-time AI interactions like those portrayed in the video are feasible.

See detailed explanation in this article

✨ Demo

https://gpt-video-jidefr.vercel.app

You'll need an OpenAI API key. Remember to delete the API key after using it, for safety.

🛠 Stack

  • Next.js with App Router.
  • Vercel AI npm module.
  • OpenAI's Whisper and GPT APIs.

🚀 Getting Started

You can provide the `OPENAI_API_KEY` environment variable or let users provide their own API key in the UI.
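
The two ways of supplying a key could be combined roughly as in the sketch below. The helper name `resolveApiKey` is hypothetical, and the precedence (a key entered in the UI wins over the environment variable) is an assumption made for illustration, not something confirmed by the project's code.

```javascript
// Hypothetical helper: choose which OpenAI API key to use for a request.
// Assumption: a key entered in the UI takes precedence over the
// OPENAI_API_KEY environment variable.
function resolveApiKey(userKey, env = process.env) {
  const fromUser = typeof userKey === "string" ? userKey.trim() : "";
  if (fromUser) return fromUser;

  const fromEnv = (env.OPENAI_API_KEY || "").trim();
  if (fromEnv) return fromEnv;

  // Neither source provided a key: fail loudly instead of sending an empty one.
  throw new Error("No OpenAI API key: set OPENAI_API_KEY or enter a key in the UI.");
}
```

Passing `env` as a parameter keeps the function easy to exercise without touching the real environment.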

First, run the development server:

npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev

🔧 Constants

Some constants are defined at the top of /src/app/page.js. You may want to tweak these:

const INTERVAL = 250;
const IMAGE_WIDTH = 512;
const IMAGE_QUALITY = 0.6;
const COLUMNS = 4;
const MAX_SCREENSHOTS = 60;
const SILENCE_DURATION = 2500;
const SILENT_THRESHOLD = -30;
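
To give a feel for how the last two constants might interact, here is a minimal sketch of a silence detector. It assumes `SILENCE_DURATION` is in milliseconds and `SILENT_THRESHOLD` is in decibels, which this README does not state; `createSilenceDetector` is a hypothetical name and is not taken from the project's source.

```javascript
const SILENCE_DURATION = 2500; // assumed to be milliseconds
const SILENT_THRESHOLD = -30;  // assumed to be decibels

// Hypothetical detector: call the returned function with each audio level
// sample. It returns true once the level has stayed below the threshold
// for at least `duration` milliseconds.
function createSilenceDetector(duration = SILENCE_DURATION, threshold = SILENT_THRESHOLD) {
  let silentSince = null; // timestamp of the first sample in the current quiet run

  return function update(timestampMs, levelDb) {
    if (levelDb >= threshold) {
      silentSince = null; // sound detected: reset the quiet run
      return false;
    }
    if (silentSince === null) silentSince = timestampMs;
    return timestampMs - silentSince >= duration;
  };
}
```

Under these assumptions, raising `SILENT_THRESHOLD` toward 0 treats louder background noise as silence, while lowering `SILENCE_DURATION` makes the detector report the end of speech sooner.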

Languages

JavaScript 80.6%, TypeScript 16.8%, CSS 2.5%