Speech after a particular interval

Question

IrfanAli17899 opened this issue 7 months ago · comments

Hi, first of all, thank you so much for this awesome package; it's working great for me. The only thing I'm wondering is whether I can get the speech every 5 seconds instead of only after a pause. Right now, if the user speaks continuously, it doesn't give us any results, and this shows up as latency in the UI. Please help, thanks in advance.

8trees · Answer 1 · Tue Dec 19 2023 19:36:37 GMT+0800 (China Standard Time)

Thanks for your proposal! Your idea is nice, but it has a problem: the audio can be split in the middle of a word.
I'm not sure it should be implemented.

When I wanted to make each speech segment shorter, I tried reducing redemptionFrames and increasing negativeSpeechThreshold.
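As a rough illustration of that tuning, here is a minimal sketch. `shorterSegmentOptions` is a hypothetical helper, not part of the package, and the starting values for `positiveSpeechThreshold`, `negativeSpeechThreshold` and `redemptionFrames` are assumptions that may differ in your version.

```javascript
// Hypothetical helper: derive options that end speech segments sooner.
// The base values below are assumed defaults; check your installed version.
function shorterSegmentOptions(overrides = {}) {
  const base = {
    positiveSpeechThreshold: 0.5,
    negativeSpeechThreshold: 0.35,
    redemptionFrames: 8,
  };
  const opts = { ...base, ...overrides };
  // Fewer redemption frames: less silence is tolerated before a segment ends.
  opts.redemptionFrames = Math.max(1, Math.floor(opts.redemptionFrames / 2));
  // Higher negative threshold: more frames count as non-speech. Keep it
  // below the positive threshold so the two thresholds still leave a gap.
  opts.negativeSpeechThreshold = Math.min(
    opts.negativeSpeechThreshold + 0.1,
    opts.positiveSpeechThreshold - 0.05
  );
  return opts;
}

const tuned = shorterSegmentOptions();
// In the browser these would be passed along with the callbacks, e.g.
// vad.MicVAD.new({ ...tuned, onSpeechEnd: (audio) => { /* ... */ } })
```

Shorter segments arrive sooner, but pushing these values too far may cut speech off during brief pauses, so the numbers are worth adjusting by ear.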

Ricky Samore · Answer 2 · Wed Dec 20 2023 01:36:29 GMT+0800 (China Standard Time)

Hi @IrfanAli17899, I'm glad this package has been useful for you. So in other words, throughout a continuous speech period, you would like to have a callback that runs on a regular 5 second interval and takes as an argument the current raw audio of the speech segment? Can I ask what kind of UI updates you are referring to? I would like to understand the use case better.

Irfan Ali · Answer 3 · Wed Dec 20 2023 01:43:05 GMT+0800 (China Standard Time)

Hi @ricky0123, yes, you are right. I am feeding the audio I get from your package to GPT for transcription and translation, and then I show those results on the frontend. I am trying to make a real-time translator. The problem is that the library doesn't provide audio segments until the user stops speaking. I want a smooth audio segment at a regular interval, so that if the user speaks continuously without stopping, I can still show the transcription and translation results. Does that make sense? Let me know if you need more explanation of the use case. Thanks very much.

Ricky Samore · Answer 4 · Wed Dec 20 2023 02:03:32 GMT+0800 (China Standard Time)

Hi @IrfanAli17899, thanks for the clarification. Have you considered streaming audio from the browser to your server and doing all of the audio processing there, instead of using this package?

Potentially what we could do is provide a method on the vad object that allows you to get the current audio segment. That would allow you to experiment by creating a timer that queries the current audio and sends it to your server.

Irfan Ali · Answer 5 · Wed Dec 20 2023 20:51:05 GMT+0800 (China Standard Time)

Yeah, I tried the browser MediaRecorder API to stream audio to the server. Only the first chunk is playable, because it carries all the necessary headers; the rest of the chunks are not. So I had to merge all the chunks on the backend and then crop the last 5 seconds for transcription. It was a very lengthy, hectic solution, which is why I tried your package.

Yes, it would be very helpful to have a prop that takes a callback function and another prop for the interval, so that it provides me with chunks. Each chunk should be playable on its own, I guess; then it will be useful for me. Let me know what you think, thanks @ricky0123.
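One way to make every chunk playable on its own is to give each one its own WAV header. The following is a minimal sketch, assuming the chunk is mono audio in a `Float32Array` with samples in [-1, 1] at 16 kHz. `encodeWav` is a name made up for this example, not a function taken from the package.

```javascript
// Wrap mono Float32Array samples in a 16-bit PCM WAV container.
function encodeWav(samples, sampleRate = 16000) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size for PCM
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, 1, true); // one channel (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  // Clamp each sample to [-1, 1] and scale it to a signed 16-bit integer.
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```

Because the header is rebuilt for every chunk, the server can decode each upload independently instead of merging chunks and cropping them.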

Ricky Samore · Answer 6 · Thu Dec 21 2023 04:31:53 GMT+0800 (China Standard Time)

Hi @IrfanAli17899 what I'm saying is that we probably won't add a callback that runs on an interval, but I would be open to adding a method for you to get the raw audio at any given time, so you could do something like

const myvad = await vad.MicVAD.new(...)
// getCurrentAudio() is the proposed method; it does not exist in the package yet
const mytimer = setInterval(function () {
    const audio = myvad.getCurrentAudio()
    // send audio to server, etc
}, myIntervalLength)

This would be easy to implement and more general. I'm not sure if the method you're describing of sending audio to your server on an interval will work, but this would at least allow you to try it out.
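To experiment with this idea before such a method exists, the behaviour can be approximated outside the package. The sketch below is an assumption-laden stand-in. `SegmentBuffer` and `pollAudio` are hypothetical names, and how audio frames reach `push` (for example from your own audio pipeline) is left open.

```javascript
// Hypothetical stand-in for the proposed getCurrentAudio(): collects frames
// of the in-progress speech segment and returns a copy of everything so far.
class SegmentBuffer {
  constructor() {
    this.frames = [];
    this.length = 0;
  }
  // frame: Float32Array of samples
  push(frame) {
    this.frames.push(frame);
    this.length += frame.length;
  }
  // Call when a speech segment ends so the next one starts empty.
  reset() {
    this.frames = [];
    this.length = 0;
  }
  getCurrentAudio() {
    const out = new Float32Array(this.length);
    let offset = 0;
    for (const frame of this.frames) {
      out.set(frame, offset);
      offset += frame.length;
    }
    return out;
  }
}

// Poll the buffer on an interval; returns a function that stops the polling.
function pollAudio(buffer, intervalMs, onAudio) {
  const id = setInterval(() => {
    const audio = buffer.getCurrentAudio();
    if (audio.length > 0) onAudio(audio);
  }, intervalMs);
  return () => clearInterval(id);
}
```

A caller would start polling with something like `const stop = pollAudio(buffer, 5000, sendToServer)` and call `stop()` when finished, where `sendToServer` is whatever upload function the application provides. Each poll returns the whole segment so far, so the server sees overlapping audio and may still receive a snapshot that ends mid-word.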

Irfan Ali · Answer 7 · Thu Dec 21 2023 04:55:46 GMT+0800 (China Standard Time)

Yes it will be very helpful, please implement it.

Irfan Ali · Answer 8 · Thu Dec 21 2023 14:05:34 GMT+0800 (China Standard Time)

Hi @ricky0123, sorry, one more question: will that audio already have been processed by the VAD model?

Ashwin Maurya · Answer 9 · Sun Apr 14 2024 23:46:46 GMT+0800 (China Standard Time)

@ricky0123 Hello Ricky, thanks for the great package. Has the feature described above (a method to get the raw audio at any given time) been implemented in the package? I am looking for the same thing, to stream audio to the server in real time.

Irfan Ali · Answer 10 · Fri May 10 2024 14:13:44 GMT+0800 (China Standard Time)

@ricky0123 Hi, is it done?