beswift / Whisper2me

A set of experimental scripts for ASR and generative AI

A collection of experimental scripts for playing with generative AI.

Use whisper2me.py to speak to your computer, or text2me.py to message it directly, and have it respond with:

  • Answers to questions
  • Interesting conversation
  • Text completion
  • Code answers and examples
  • Image generation
  • Music generation
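
As a rough illustration of how one input could be steered toward the response types listed above, here is a minimal sketch of keyword-based routing. It is not taken from whisper2me.py or text2me.py: the `ROUTES` table, the `route_request` function, and the keyword lists are all hypothetical, and the scripts may well decide this differently (for example, by asking a language model).

```python
# Hypothetical sketch: route a typed or transcribed request to a response type.
# None of these names come from the repository.

ROUTES = [
    ("image", ("draw", "paint", "picture of", "image of")),
    ("music", ("play", "compose", "song", "music")),
    ("code", ("code", "function", "script")),
]


def route_request(text: str) -> str:
    """Return the name of the handler that should answer `text`."""
    lowered = text.lower()
    for name, keywords in ROUTES:
        # Plain substring matching: crude, but enough to show the idea.
        if any(keyword in lowered for keyword in keywords):
            return name
    # Everything else falls through to conversation or text completion.
    return "chat"


if __name__ == "__main__":
    for prompt in ("Draw a picture of a red fox", "What is ASR?"):
        print(prompt, "->", route_request(prompt))
```

Substring matching will misfire on words such as "display" (which contains "play"), so treat this only as a starting point.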

Work in progress; more to come.

Concepts, code and inspiration built on the shoulders of these giants:

Whisper from OpenAI is used for speech-to-text (STT).
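
For readers who have not used Whisper before, the sketch below shows one way to transcribe an audio file with the `openai-whisper` Python package. It is an assumption that the scripts call Whisper in roughly this way; the helper names `transcribe_file` and `clean_transcript`, the `"base"` model size, and the example file name are illustrative only.

```python
def clean_transcript(result: dict) -> str:
    """Extract the recognized text from a Whisper result dictionary."""
    # Whisper returns a dict whose "text" entry often has leading whitespace.
    return result.get("text", "").strip()


def transcribe_file(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file and return the cleaned text."""
    # Imported lazily so this module loads without the package installed.
    # Requires `pip install openai-whisper` and ffmpeg on the PATH.
    import whisper

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(audio_path)
    return clean_transcript(result)
```

Calling `transcribe_file("question.wav")` (a hypothetical file) would return the spoken text as a plain string, ready to be passed on as a prompt.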

'Riffusion' [created by Seth Forsgren and Hayk Martiros](https://huggingface.co/riffusion/riffusion-model-v1)

You can use the Riffusion model directly, or try the Riffusion web app (https://www.riffusion.com/).

'Riffusion demo' https://huggingface.co/spaces/anzorq/riffusion-demo

The Riffusion model was created by fine-tuning the Stable-Diffusion-v1-5 checkpoint.

Image generation is done with either OpenAI DALL-E (if you have an API key) or Stable Diffusion.

OpenAI DALL-E: https://openai.com/blog/dall-e/

Stable Diffusion: https://huggingface.co/blog/stable-diffusion
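
The choice between the two image backends described above can be sketched as a small helper. This is a guess at the logic, not the repository's code: the function name `choose_image_backend`, the returned labels, and the `OPENAI_API_KEY` environment variable are assumptions (the variable name follows the common convention of the OpenAI client library).

```python
import os
from typing import Mapping, Optional


def choose_image_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "dalle" when an API key is configured, else "stable-diffusion"."""
    env = os.environ if env is None else env
    # OPENAI_API_KEY is an assumed variable name, not confirmed by the repository.
    api_key = env.get("OPENAI_API_KEY", "").strip()
    return "dalle" if api_key else "stable-diffusion"


if __name__ == "__main__":
    print("Image backend:", choose_image_backend())
```

Passing the environment in as a mapping keeps the function easy to test; with no argument it reads the real process environment.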

'Whisper Mic' https://github.com/mallorbc/whisper_mic

'Coqui.ai TTS' https://tts.readthedocs.io/en/latest/index.html

Languages

Python 100.0%