rhasspy / larynx

End to end text to speech system using gruut and onnx

Cannot redirect audio output to file with --raw-stream

Ch41r05 opened this issue · comments

When I try to redirect larynx output to a .wav file from the shell, the resulting file is corrupted. When I run the same command but pipe it to aplay instead (| aplay), it plays flawlessly.
larynx -v cmu_jmk -q high --raw-stream < /mnt/hgfs/HostSharedFolder/text/text.txt > test.wav
Am I missing something?
Following the information given in the wiki, the command larynx -v cmu_jmk -q high "Test text." > test.wav works as expected, so there seems to be an issue with combining the --raw-stream option and output redirection. Could you please help?

Did you solve this? I'm trying to stream the audio and save it to a file at the same time, and I can only do that if I launch larynx twice.

@framilano Yes, I used Python to save it to a file. You must import wavfile from larynx and then call the save method from it. The problem I'm encountering with this method is that I don't understand what value the bitrate should have.

The raw stream doesn't contain WAV header info since I can't know the duration beforehand. By default, the output should be 22050 Hz 16-bit mono PCM.

Here's an example using sox to save to a WAV file:

larynx --raw-stream < test.txt | sox -t raw -r 22050 -b 16 -c 1 -e signed-integer - -t wav test.wav
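For anyone who wants to do the same from Python, here is a minimal sketch using only the standard library `wave` module. It assumes the default format stated above (22050 Hz, 16-bit, mono PCM); the function name `raw_to_wav` and its `chunk_size` parameter are made up for illustration and are not part of larynx. On the bitrate question: for this format it works out to 22050 × 16 × 1 = 352800 bits per second, but `wave` only needs the sample rate, the sample width in bytes, and the channel count.

```python
import wave

# Assumed format, taken from the comment above: 22050 Hz, 16-bit, mono PCM.
SAMPLE_RATE = 22050
SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample
CHANNELS = 1


def raw_to_wav(raw_stream, wav_path, chunk_size=4096):
    """Copy headerless PCM bytes from raw_stream into a WAV file.

    raw_stream is any binary file-like object (for example sys.stdin.buffer).
    Returns the number of audio bytes written.
    """
    total = 0
    with wave.open(wav_path, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE)
        while True:
            chunk = raw_stream.read(chunk_size)
            if not chunk:
                break
            # The wave module fixes up the header sizes when the file is closed,
            # so the total duration does not need to be known in advance.
            wav_file.writeframes(chunk)
            total += len(chunk)
    return total
```

To use it with larynx, a script (say a hypothetical save_wav.py) could import `sys` and call `raw_to_wav(sys.stdin.buffer, "test.wav")`, then be run as `larynx --raw-stream < test.txt | python save_wav.py`.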

Fantastic @synesthesiam, I'll try that output configuration in Python as soon as I can :-) If I can make it work, I'll try to write a test to submit so there's at least an example of parsing some text.

Hi @synesthesiam,

I tried to use the configuration you sent me with Python, but I'm having a hard time setting the 16-bit mono PCM format (I apologize, but I'm not familiar with audio theory). Would you please share an example?

Thanks a lot