Mozer / talk-llama-fast

Port of OpenAI's Whisper model in C/C++ with xtts and wav2lip

The voice (and video) cuts out early and doesn't complete the response; would love the SillyTavern instructions.

tomstur1 opened this issue · comments

XTTS seems to cut out early, before the response is finished.
I set the chunks with --wav-chunk-sizes=100,200,300,400,9999.
No go.

SillyTavern proper with Koboldcpp.exe and another model, with Extras enabled but without the video encoder (talk-llama-wav2lip.bat), has no problems with XTTS.

I would love full SillyTavern instructions to get the video working; I don't care much for fast-llama.

You need to run both xtts_wav2lip.bat and my modified silly_extras to make it work with SillyTavern.
Also, don't forget the --stream-to-wavs and --call-wav2lip params in xtts_wav2lip.bat; they are also needed.
Turn off streaming for koboldcpp and for XTTS in SillyTavern. Streaming is not yet supported in ST for wav2lip.
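For reference, the flags mentioned above would all sit on the launch line inside xtts_wav2lip.bat. A minimal sketch, where xtts_server.py is a hypothetical entry-point name (the actual script in the repo may differ); only the flags themselves come from this thread:

```shell
:: xtts_wav2lip.bat (sketch) -- xtts_server.py is a hypothetical name;
:: the three flags below are the ones discussed in this thread.
python xtts_server.py ^
  --stream-to-wavs ^
  --call-wav2lip ^
  --wav-chunk-sizes=100,200,300,400,9999
```

The `--wav-chunk-sizes` values control how much audio is generated per chunk; the trailing 9999 lets the final chunk run to the end of the response.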

I have tested --wav-chunk-sizes=100,200,300,400,9999 - no problem found.

With SillyTavern connected to Extras (silly_extras.bat) and xtts_wav2lip.bat, it works, but only for a single conversation (the first one); then it freezes until a new chat session is created. Odd. I enabled XTTSv2 and connected the xtts_wav2lip service to it. I had to disable streaming for XTTS in SillyTavern.
But at least I got it to say what I want it to say. Fun stuff.

(using Koboldcpp)

Weird. Any errors?