DEMO:

VoiceVox.mp4

README

This script records audio from your microphone and transcribes it to text with OpenAI's Whisper. It then translates the text from English to Japanese with DeepL, calls the VoiceVox Engine to generate speech from the Japanese text, and plays the result back.
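The four stages described above can be sketched as a simple chain of functions. This is only an illustration of the data flow, not the code in main.py: the name run_pipeline and its parameters are hypothetical, and each stage is passed in as a callable so the chain can be tried without Whisper, DeepL, or VoiceVox installed.

```python
from typing import Callable


def run_pipeline(
    audio_path: str,
    transcribe: Callable[[str], str],     # audio file path -> English text (e.g. a Whisper wrapper)
    translate: Callable[[str], str],      # English text -> Japanese text (e.g. a DeepL wrapper)
    synthesize: Callable[[str], bytes],   # Japanese text -> WAV bytes (e.g. a VoiceVox wrapper)
    play: Callable[[bytes], None],        # WAV bytes -> audio output
) -> str:
    """Run the record -> transcribe -> translate -> speak chain once.

    All stage functions are hypothetical placeholders supplied by the caller.
    Returns the translated Japanese text.
    """
    english = transcribe(audio_path)
    japanese = translate(english)
    wav_bytes = synthesize(japanese)
    play(wav_bytes)
    return japanese


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without any external service.
    text = run_pipeline(
        "recording.wav",
        transcribe=lambda path: "hello",
        translate=lambda english: "こんにちは",
        synthesize=lambda japanese: b"RIFF",
        play=lambda wav: None,
    )
    print(text)
```

Keeping the stages as separate callables makes it easy to swap one service for another or to test the flow with stand-ins.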

Requirements

Python 3
Whisper installed on your machine: pip install git+https://github.com/openai/whisper.git
DeepL API key
VoiceVox Engine running in a Docker container. Pull the image with docker pull voicevox/voicevox_engine:cpu-ubuntu20.04-latest and start the container with docker run --rm -it -p '127.0.0.1:50021:50021' voicevox/voicevox_engine:cpu-ubuntu20.04-latest
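Once the container is listening on 127.0.0.1:50021, speech is typically generated in two HTTP calls: POST /audio_query to build a synthesis query from the text, then POST /synthesis to turn that query into WAV audio. The sketch below uses only the standard library. The helper names are hypothetical, and the speaker ID of 1 is an assumption; the script in this repository may use a different voice or a different HTTP client.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:50021"  # matches the port mapping in the docker run command


def audio_query_request(text: str, speaker: int = 1, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /audio_query request (text and speaker go in the query string)."""
    params = urllib.parse.urlencode({"text": text, "speaker": speaker})
    return urllib.request.Request(f"{base_url}/audio_query?{params}", data=b"", method="POST")


def synthesis_request(query: dict, speaker: int = 1, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /synthesis request (the audio query is sent back as a JSON body)."""
    params = urllib.parse.urlencode({"speaker": speaker})
    return urllib.request.Request(
        f"{base_url}/synthesis?{params}",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, speaker: int = 1, base_url: str = BASE_URL) -> bytes:
    """Return WAV bytes for the given Japanese text. Requires the engine to be running."""
    with urllib.request.urlopen(audio_query_request(text, speaker, base_url)) as resp:
        query = json.load(resp)
    with urllib.request.urlopen(synthesis_request(query, speaker, base_url)) as resp:
        return resp.read()


if __name__ == "__main__":
    # Writes the synthesized speech to a file; speaker=1 is an assumed voice ID.
    with open("output.wav", "wb") as f:
        f.write(synthesize("こんにちは"))
```

If the call fails with a connection error, check that the container is still running and that port 50021 is mapped as shown above.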

Usage

Clone this repository and navigate to the directory.
Install the required Python packages: pip install -r requirements.txt
Start the VoiceVox Engine Docker container: docker run --rm -it -p '127.0.0.1:50021:50021' voicevox/voicevox_engine:cpu-ubuntu20.04-latest
Run the script: python main.py
Press the r key on your keyboard to start and stop recording. When you stop recording, the script will transcribe the audio, translate the text, and generate speech in Japanese from the translated text.
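The start/stop behaviour of the r key can be pictured as a small toggle. This is a sketch of the idea only: the class name RecordingToggle and its callback are hypothetical, and the way main.py actually listens for key presses is not shown here.

```python
from typing import Callable


class RecordingToggle:
    """Toggle recording on each press of a chosen key (hypothetical helper)."""

    def __init__(self, on_stop: Callable[[], None], key: str = "r") -> None:
        self.key = key
        self.recording = False
        self.on_stop = on_stop  # called once each time recording stops

    def handle_key(self, key: str) -> bool:
        """Process one key press and return whether recording is now active."""
        if key != self.key:
            return self.recording  # ignore every other key
        if self.recording:
            self.recording = False
            self.on_stop()  # e.g. transcribe, translate, and speak
        else:
            self.recording = True
        return self.recording


if __name__ == "__main__":
    toggle = RecordingToggle(on_stop=lambda: print("processing recording"))
    toggle.handle_key("r")  # starts recording
    toggle.handle_key("r")  # stops recording and triggers the callback
```

Running the processing step from the stop callback mirrors the flow described above: nothing is transcribed until the second press of r.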

License

This project is licensed under the MIT License. See the LICENSE file for details.

eminbayrak / voicevox_client
