k7d / picroast

Home Page: https://picroast.app

PicRoast

This repository contains the code for the PicRoast app.

PicRoast lets you capture a photo and turn it into a comedy roast. It uses the OpenAI GPT-4 with vision and ElevenLabs text-to-speech APIs.
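
As a rough illustration of how these two APIs could be chained, here is a minimal sketch that only builds the HTTP requests: one sending a photo to the OpenAI Chat Completions endpoint, one sending the resulting text to the ElevenLabs text-to-speech endpoint. The function names, the prompt text, the `max_tokens` value, and the model identifiers (`gpt-4-vision-preview`, `eleven_monolingual_v1`) are assumptions made for this example and are not taken from the PicRoast source.

```typescript
// Hypothetical request builders; names, prompt, and model IDs are illustrative only.

interface HttpRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Assumption: substitute whichever vision-capable model your account offers.
const VISION_MODEL = "gpt-4-vision-preview";

// Build a Chat Completions request that sends a base64-encoded JPEG plus a text prompt.
function buildRoastRequest(imageBase64: string, apiKey: string): HttpRequest {
  return {
    url: "https://api.openai.com/v1/chat/completions",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: VISION_MODEL,
      max_tokens: 300,
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: "Write a short, good-natured comedy roast of this photo." },
            { type: "image_url", image_url: { url: `data:image/jpeg;base64,${imageBase64}` } },
          ],
        },
      ],
    }),
  };
}

// Build an ElevenLabs text-to-speech request for the given voice ID.
function buildSpeechRequest(text: string, voiceId: string, apiKey: string): HttpRequest {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    headers: {
      "Content-Type": "application/json",
      "xi-api-key": apiKey,
    },
    // Assumption: model_id is one of the ElevenLabs model identifiers.
    body: JSON.stringify({ text, model_id: "eleven_monolingual_v1" }),
  };
}
```

Each returned object can be passed to `fetch(req.url, { method: "POST", headers: req.headers, body: req.body })`. Keeping request construction separate from the network call makes the payloads easy to inspect without spending API credits.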

How It Was Built

PicRoast was built as an experiment to explore the capabilities of GPT-4, especially its new vision support, for generating code.

Most of the code was generated using GPT, with a combination of text and visual instructions created with Whimsical Wireframes.

For visual instructions, the flow was pretty straightforward:

  1. Create wireframes in Whimsical
  2. Add some annotations and flows for logic
  3. Select and copy as image (Cmd-Shift-C)
  4. Paste it into ChatGPT and ask it to update the code (assuming the code was provided to ChatGPT earlier) based on the diagram
  5. Iterate and repeat as needed

Here are some example Whimsical snapshots that were used in the process:

image

All the bitmap images were also generated using the new DALL·E 3 support in GPT-4.

Running locally

  1. Set API keys using environment variables:
export OPENAI_API_KEY=YOUR_API_KEY
export ELEVEN_API_KEY=YOUR_API_KEY
  2. Update voice IDs in speech.ts (these need to be added to your VoiceLab in ElevenLabs)
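
A missing key tends to surface only as a failed API request, so it can help to check both variables at startup. The following is a minimal sketch of such a check. The `readApiKeys` helper and its types are hypothetical and are not part of the PicRoast source; only the two variable names come from the steps above.

```typescript
// Hypothetical startup check; function and type names are not from the PicRoast source.

type Env = Record<string, string | undefined>;

interface ApiKeys {
  openai: string;
  eleven: string;
}

// Return both API keys, or throw an error naming every variable that is unset or blank.
function readApiKeys(env: Env): ApiKeys {
  const required = ["OPENAI_API_KEY", "ELEVEN_API_KEY"];
  const missing = required.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    openai: env["OPENAI_API_KEY"] as string,
    eleven: env["ELEVEN_API_KEY"] as string,
  };
}
```

In a Node.js process this could be called as `readApiKeys(process.env)` before the first API request is made.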

Run the development server:

npm run dev

Open http://localhost:3000 with your browser to see the result.


Languages

TypeScript 95.2%, CSS 3.9%, JavaScript 0.9%