rickt / sirdavid

David Attenborough Scene Commentary Creator

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

sirdavid

Have you ever wanted David Attenborough's voice to describe the contents of a webcam photo in pseudo-real time over a secure websocket? Why not indeed!

Live demo up at https://sirdavid.rickt.dev

index.html:

  • takes a snapshot of the browser's webcam
  • uploads it over secure websocket to a mini backend

main.py:

  • sets up a secure websocket listener
  • looks for a .png sent over the websocket
  • if an image arrives, saves it locally and in a GCP bucket
  • decodes the image, has it described by OpenAI's gpt-4o model in a snarky David Attenborough manner
  • generates audio file using a custom ElevenLabs David Attenborough voice i created
  • sends the URL of the audio file to the browser
  • plays the audio in the browser

Environment Variables:

You'll need to set:

  • ELEVENLABS_API_KEY api key for ElevenLabs
  • OPENAI_API_KEY api key for OpenAI
  • SIRDAVID_APIGW URI pointing to (in this case) my Cloudflare AI Gateway
  • SIRDAVID_BUCKET name of GCP bucket to store images, text analyses & audio files
  • SIRDAVID_PORT port for the websocket listener
  • SIRDAVID_SERVICEACCOUNT_JSON path to service account JSON file for auth to GCP bucket
  • SIRDAVID_SSL_CERT path to the SSL certificate for the secure websocket
  • SIRDAVID_SSL_KEY path to the SSL key for the secure websocket certificate
  • SIRDAVID_VOICE string containing the voice ID of the ElevenLabs voice

Notes

  • You will have to make your own ElevenLabs David Attenborough voice, as I can't share mine

About

David Attenborough Scene Commentary Creator


Languages

Language:Python 61.9%Language:HTML 36.1%Language:Shell 2.1%