Using streaming + onTranscribe (custom server) together?
sgrove opened this issue · comments
Very impressed by this project, thank you so much for it!
Is there some way to stream the audio to a server endpoint (as in the examples) but also have it iteratively return results? Right now it seems that if streaming: true is set, it will only hit the Whisper API directly from the frontend (e.g. https://api.openai.com/v1/audio/transcriptions).
That means there's quite a long pause between the end of recording and getting the result (since ffmpeg has to run at the end, and then a fairly large file has to be uploaded before the transcription comes back). I'm curious if there's a way to avoid that with the current design?
Hello @sgrove
- if you want to send streaming audio to your own server, you can use onDataAvailable.
- when you pass streaming = true, onDataAvailable will be called at an interval based on timeSlice.
import { useWhisper } from '@chengsokdara/use-whisper'
const streamToServer = (blob) => {
  // send this chunk of audio to your server
  // the implementation is up to you; see how onDataAvailable is used in the source code for more detail
}
const { transcript } = useWhisper({ streaming: true, onDataAvailable: streamToServer })
- ffmpeg only runs the conversion if you pass removeSilence = true.
- in case of streaming = true, ffmpeg won't do anything.
- for streaming, I tried sending the chunks of audio to be transcribed one by one, but the transcription was not good, so currently it concatenates the audio blobs in succession and sends the result to Whisper.
- I am still looking for a better way to do streaming; if you have a better idea, it is very welcome.
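The concatenate-and-resend approach described above can be applied to a custom server as well. What follows is only a minimal sketch, not code from use-whisper: createChunkStreamer, postToServer, the /api/whisper endpoint and its { text } response shape are all hypothetical names chosen for illustration. It assumes onDataAvailable receives a Blob per time slice, as in the snippet above, and that Blob, FormData and fetch are available (browsers, or Node 18+).

```javascript
// Hypothetical helper (not part of use-whisper): keeps every chunk recorded so far
// and hands the concatenated audio to `send` each time a new chunk arrives.
function createChunkStreamer(send, mimeType = 'audio/webm') {
  const chunks = []
  return {
    // Pass this function as the onDataAvailable option.
    onDataAvailable(blob) {
      chunks.push(blob)
      // Concatenate all chunks in order, so the upload starts with the first
      // chunk (which carries the container header) instead of a lone fragment.
      const combined = new Blob(chunks, { type: mimeType })
      return send(combined)
    },
    // Call this when a new recording starts.
    reset() {
      chunks.length = 0
    },
  }
}

// Example `send`, assuming a hypothetical POST /api/whisper endpoint on your
// own server that responds with JSON of the form { text }.
async function postToServer(blob) {
  const form = new FormData()
  form.append('file', blob, 'speech.webm')
  const res = await fetch('/api/whisper', { method: 'POST', body: form })
  const { text } = await res.json()
  return text
}
```

With this in place, something like `const streamer = createChunkStreamer(postToServer)` followed by `useWhisper({ streaming: true, onDataAvailable: streamer.onDataAvailable })` would upload the growing recording on every time slice. Uploads are not serialized here, so responses may arrive out of order; a real implementation would probably want to drop stale responses.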
@chengsokdara Great project! Is it possible for you to create a Next.js example of this working while only exposing the OpenAI API key to the server?
@haluvibe Yes, I will add that later when this is a bit more stable; I am currently trying to make it truly cross-browser.
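In the meantime, the server-side half of such a setup might look roughly like the sketch below. It is not taken from this project: buildWhisperRequest and transcribeOnServer are hypothetical names, the key is assumed to live in an OPENAI_API_KEY environment variable, and the code is assumed to run on the server (Node 18+, where fetch, FormData and Blob are global). The endpoint URL is the one mentioned earlier in this thread, and whisper-1 is the model name the OpenAI transcription API expects.

```javascript
// Server-side only: the API key is read here and never sent to the browser.
const WHISPER_URL = 'https://api.openai.com/v1/audio/transcriptions'

// Hypothetical helper: builds the multipart request for the transcription endpoint.
function buildWhisperRequest(audioBlob, apiKey) {
  const form = new FormData()
  form.append('file', audioBlob, 'speech.webm')
  form.append('model', 'whisper-1')
  return {
    url: WHISPER_URL,
    init: {
      method: 'POST',
      // fetch derives the multipart Content-Type (with boundary) from the FormData body.
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    },
  }
}

// Hypothetical helper: call this from your own route handler with the uploaded audio.
async function transcribeOnServer(audioBlob, apiKey = process.env.OPENAI_API_KEY) {
  const { url, init } = buildWhisperRequest(audioBlob, apiKey)
  const res = await fetch(url, init)
  if (!res.ok) throw new Error(`Whisper request failed with status ${res.status}`)
  const { text } = await res.json()
  return text
}
```

A route handler in Next.js (or any other server framework) would read the uploaded file from the incoming request, pass it to transcribeOnServer, and return the text to the client as JSON.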
Hello @chengsokdara,
Thanks for this great lib!
I have already set up a custom server, but the request is only made after the user stops speaking, so the user has to wait for the response.
I saw your comment about adding onDataAvailable to stream the response.
But in your code at line 481 in 139a22f you set the transcript, so if we add our own handler, do we just have to return the text response from our custom server?
If you have a template or snippet showing exactly what we can do to get streaming + onTranscribe (custom server) working, it would be great for us.
Thanks