RAM stands for Repeat-After-Me. It is a Python script that inserts blanks (silent gaps) into your audio so that you can do repeat-after-me practice for language learning. You can specify the difficulty level of the generated audio: the easier the level, the shorter the segments, the longer the blanks, and the lower the playback speed. It supports a wide range of languages thanks to the silero-vad project.
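To make the idea concrete, here is a minimal sketch of interleaving speech segments with blanks. It is not RAM's actual implementation: the function name `interleave_with_blanks`, the `blank_ratio` parameter, and the plain-list audio representation are all assumptions made for illustration. In the real tool the speech segments would come from silero-vad, and segment length and playback speed would also vary with the level.

```python
from typing import List, Tuple


def interleave_with_blanks(
    samples: List[float],
    segments: List[Tuple[int, int]],
    blank_ratio: float,
) -> List[float]:
    """Return the speech segments, each followed by a stretch of silence.

    samples:     mono audio as a flat list of sample values
    segments:    (start, end) sample indices of detected speech (hypothetical
                 input; a voice activity detector would normally supply these)
    blank_ratio: length of each blank relative to the segment before it
                 (hypothetical knob; an easier level would use a larger value)
    """
    out: List[float] = []
    for start, end in segments:
        chunk = samples[start:end]
        out.extend(chunk)
        # Append silence so the learner has time to repeat the segment.
        out.extend([0.0] * int(len(chunk) * blank_ratio))
    return out
```

With `blank_ratio=1.5`, a segment of two samples is followed by three samples of silence, so the learner gets one and a half times the segment's duration to repeat it.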
Use the audio from Professor Brian Harvey talking about cheating in class as an example:
Clone this repo:
$ git clone https://github.com/ZhengHe-MD/ram.git
Install Python dependencies:
# step back out of the ram directory afterwards so the usage examples below work as written.
$ cd ram && pip install -r requirements.txt && cd ..
Assuming the current directory contains this repo, basic usage is:
# generates out.wav in current directory with difficulty level of easy
$ python ram path/to/audio.wav
# specify the difficulty level
$ python ram path/to/audio.wav --level=hard
# specify the output audio file
$ python ram path/to/audio.wav --level=medium --output-audio ./audio-medium.wav
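If you want one output file per difficulty level, the commands above can be scripted. The following is a rough sketch, not part of RAM itself: the helper names `build_commands` and `run_all` are hypothetical, the `audio-<level>.wav` naming simply mirrors the example above, and passing `--level=easy` explicitly is an assumption based on easy being the level used when none is given.

```python
import subprocess
import sys
from typing import List

# Level names as they appear in the usage examples above.
LEVELS = ["easy", "medium", "hard"]


def build_commands(audio_path: str) -> List[List[str]]:
    """Build one `python ram ...` command per difficulty level."""
    return [
        [
            sys.executable, "ram", audio_path,
            f"--level={level}",
            "--output-audio", f"./audio-{level}.wav",
        ]
        for level in LEVELS
    ]


def run_all(audio_path: str) -> None:
    """Run RAM once per level; must be called from the directory containing the repo."""
    for cmd in build_commands(audio_path):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Calling `run_all("path/to/audio.wav")` should leave `audio-easy.wav`, `audio-medium.wav`, and `audio-hard.wav` in the current directory.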
Check out the full list of options with the following command:
$ python ram -h
Special thanks to the silero-vad project. RAM would be nothing without it.
Can I use other audio formats such as mp3?
Audio decoding and encoding are handled by torchaudio, and all supported formats are listed here. Note that .wav is supported out of the box, while other formats such as .mp3 require extra setup; please refer to the torchaudio doc. Another way to work around this is to convert the file with ffmpeg, or any other tool that can convert between audio formats, although that is outside the scope of this project.
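As a rough illustration of the ffmpeg workaround, the sketch below converts a file to .wav before handing it to RAM. It assumes ffmpeg is installed and on your PATH; the helper names `ffmpeg_to_wav_command` and `convert_to_wav` are hypothetical and not part of this repo. It uses ffmpeg's basic `ffmpeg -i input output` form and lets ffmpeg pick the output format from the `.wav` extension.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def ffmpeg_to_wav_command(src: str) -> List[str]:
    """Build the ffmpeg command that writes a .wav file next to the source file."""
    dst = str(Path(src).with_suffix(".wav"))
    return ["ffmpeg", "-i", src, dst]


def convert_to_wav(src: str) -> str:
    """Convert src to .wav with ffmpeg and return the path of the new file."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    cmd = ffmpeg_to_wav_command(src)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if ffmpeg fails
    return cmd[-1]
```

For example, `convert_to_wav("talk.mp3")` would produce `talk.wav`, which can then be passed to `python ram talk.wav`.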