jnnnnn / mubbles

A proof-of-concept GUI dictation app based on Whisper.cpp-rs

Mubbles

An egui wrapper around whisper.cpp, a C/C++ implementation of OpenAI's Whisper speech-to-text model.

It can record from both the microphone and the speakers.

Usage

Obtain a ggml-format Whisper model. The easiest way is probably to download one from https://huggingface.co/ggerganov/whisper.cpp/tree/main. More documentation about models is available [here].

The app expects to find a model at ./small.bin or ~/.cache/whisper/small.bin.
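
The lookup described above can be sketched in Rust as follows. This is a minimal illustration and not the app's actual code: the function names are hypothetical, and it assumes that ~ resolves through the HOME environment variable (or USERPROFILE on Windows).

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Candidate model locations, in the order listed above.
/// `home` is passed in explicitly so the function is easy to test.
/// (Hypothetical helper, not part of mubbles.)
fn candidate_model_paths(home: Option<&Path>) -> Vec<PathBuf> {
    let mut paths = vec![PathBuf::from("./small.bin")];
    if let Some(home) = home {
        paths.push(home.join(".cache").join("whisper").join("small.bin"));
    }
    paths
}

/// Returns the first candidate that exists as a regular file, if any.
fn first_existing(candidates: &[PathBuf]) -> Option<PathBuf> {
    candidates.iter().find(|p| p.is_file()).cloned()
}

/// Assumption: the home directory comes from HOME, falling back to USERPROFILE.
fn home_dir_from_env() -> Option<PathBuf> {
    env::var_os("HOME")
        .or_else(|| env::var_os("USERPROFILE"))
        .map(PathBuf::from)
}

/// Combines the helpers above: where would a model be picked up from?
fn locate_model() -> Option<PathBuf> {
    let home = home_dir_from_env();
    first_existing(&candidate_model_paths(home.as_deref()))
}
```

Calling locate_model() returns None when neither file is present, which is a convenient place to print a hint about downloading a model before starting the GUI.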

I have found that the small model is a good balance between performance and quality.

  • If you're on an M1 or later Mac, or have a beefy CUDA card, medium may be better.
  • If you're running on CPU only, use tiny.

If necessary, you can also get a PyTorch-format model from Hugging Face and then convert it to ggml using this script.

Once you have a model file, run the app:

cargo run

If you don't have a CUDA-capable GPU with the CUDA toolkit installed, you may need to remove the cuda feature from the whisper-rs dependency. In that case, I would recommend using the tiny model.

Regular usage

Install the app into the default cargo bin directory (probably ~/.cargo/bin):

cargo install --path .

Screenshot

mubbles screenshot.png

License: MIT License


Languages

Rust 65.3%, Python 25.5%, HTML 7.5%, JavaScript 1.0%, Shell 0.7%