DiUS / esp-joker

Demo of Speech Recognition (SR) and Text-to-Speech (TTS) for the Espressif ESP-S3-BOX dev kit

esp-joker

This is a small demo of Speech Recognition (SR) and Text-to-Speech (TTS) for the Espressif ESP-S3-BOX development kit. Unlike most SR/TTS setups these days, both run directly on the microcontroller rather than relying on a cloud server.

When asked, the demo will deliver one of 200+ "dad" jokes of varying amusement level. Each response is synthesised on the fly to demonstrate the ability to turn any text into speech on demand, rather than playing back pre-generated audio.
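As a rough illustration of the "on demand" idea, here is a minimal sketch in plain C: a joke is picked as text at request time and only then handed to a speech callback. Everything in it is hypothetical and not taken from this repo: the names pick_joke, tell_joke and speak_fn, the callback shape, and the placeholder strings standing in for the real jokes database.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback standing in for the TTS engine: it receives
 * plain text and is expected to synthesise and play it. */
typedef void (*speak_fn)(const char *text, void *ctx);

/* Placeholder entries only; the real jokes come from the submodule. */
static const char *const jokes[] = {
    "placeholder joke one",
    "placeholder joke two",
    "placeholder joke three",
};

#define JOKE_COUNT (sizeof(jokes) / sizeof(jokes[0]))

/* Map an arbitrary random value onto a valid joke index. */
const char *pick_joke(unsigned int random_value)
{
    return jokes[random_value % JOKE_COUNT];
}

/* Choose a joke now and pass its text to the speech callback.
 * No audio exists until the callback runs. */
void tell_joke(unsigned int random_value, speak_fn speak, void *ctx)
{
    assert(speak != NULL);
    speak(pick_joke(random_value), ctx);
}
```

On the device, the random value could plausibly come from a hardware RNG such as ESP-IDF's esp_random(), but that is an assumption about how one might wire it up, not a description of this firmware.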

The TTS engine used is PicoTTS. It is not as polished as those engines running on far more capable server hardware, but it is a big step up from the older "Talkie" style TTS.

Wake word and commands

As configured, it will respond to the "hey willow" wake word/phrase. Once woken, you can use any one of the following to request a joke:

  • tell me a joke
  • tell me another joke
  • another joke
  • entertain me
  • amuse me

You can also ask it to restart itself with "restart please".
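The phrase list above boils down to a many-to-one mapping: five phrases request a joke and one requests a restart. A minimal sketch of that mapping in plain C follows. The type and function names (action_t, command_t, action_for_phrase) are hypothetical, and the real firmware most likely receives a command ID from the speech recognition library rather than comparing strings, so treat this purely as an illustration of the table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical action IDs; the real firmware may organise this differently. */
typedef enum {
    ACTION_NONE = 0,
    ACTION_TELL_JOKE,
    ACTION_RESTART,
} action_t;

typedef struct {
    const char *phrase;
    action_t action;
} command_t;

/* Phrases copied from the list above. */
static const command_t commands[] = {
    { "tell me a joke",       ACTION_TELL_JOKE },
    { "tell me another joke", ACTION_TELL_JOKE },
    { "another joke",         ACTION_TELL_JOKE },
    { "entertain me",         ACTION_TELL_JOKE },
    { "amuse me",             ACTION_TELL_JOKE },
    { "restart please",       ACTION_RESTART },
};

/* Return the action for an exactly matching phrase, or ACTION_NONE. */
action_t action_for_phrase(const char *phrase)
{
    assert(phrase != NULL);
    for (size_t i = 0; i < sizeof(commands) / sizeof(commands[0]); ++i) {
        if (strcmp(phrase, commands[i].phrase) == 0) {
            return commands[i].action;
        }
    }
    return ACTION_NONE;
}
```

Keeping the phrases in a single table means adding a new trigger phrase is a one-line change, with no new branch in the dispatch logic.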

Building and flashing

First, ensure you have either cloned the repo recursively or explicitly initialised the submodules with git submodule update --init --recursive. The jokes database is sourced from a submodule, and the build will fail if the submodule is not available.

With that done, and assuming the ESP-IDF has been set up and activated, simply run:

  idf.py build
  idf.py flash

If you wish to also monitor the console output, use idf.py monitor.
