Sampling rate mismatch when recording audio with mic in browser in Inference tab

Question

domesticatedviking opened this issue 3 months ago

I had a 44100Hz dataset, which I downsampled to 40000Hz before training. (I don't understand why 44100Hz datasets aren't supported, but that's a separate issue.)

Training completed successfully.

When I tested the model in the "Inference" tab with audio recorded in the browser, the converted speech was much too fast and its pitch was much too high.
Suspecting a sampling rate mismatch, I recorded my test audio in Audacity at 40000Hz, uploaded it, and converted it. The output was normal.

Suggestion: resample mic audio captured in the Inference tab to the model's sampling rate before voice conversion.
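To illustrate the idea, here is a minimal sketch in plain Python, not the project's actual code. It assumes the recording arrives as a `(sample_rate, samples)` pair, which is the shape Gradio's Audio component returns with `type="numpy"`. The names `MODEL_SR`, `resample_linear`, and `prepare_mic_audio` are hypothetical. Linear interpolation is used only to keep the example dependency-free. A real fix would more likely use a band-limited resampler such as `scipy.signal.resample_poly` or `librosa.resample`.

```python
from typing import List, Sequence, Tuple

# Assumption: the model in this report was trained on 40000 Hz audio.
MODEL_SR = 40000


def resample_linear(samples: Sequence[float], src_sr: int, dst_sr: int) -> List[float]:
    """Resample mono audio by linear interpolation (illustrative, not band-limited)."""
    if src_sr <= 0 or dst_sr <= 0:
        raise ValueError("sample rates must be positive")
    if src_sr == dst_sr or len(samples) == 0:
        return list(samples)

    # Keep the duration constant: the sample count scales with the rate ratio.
    n_out = max(1, round(len(samples) * dst_sr / src_sr))
    step = src_sr / dst_sr  # distance between output samples, in input samples
    last = len(samples) - 1

    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def prepare_mic_audio(
    audio: Tuple[int, Sequence[float]], model_sr: int = MODEL_SR
) -> Tuple[int, List[float]]:
    """Hypothetical hook: convert (sample_rate, samples) to the model's rate."""
    src_sr, samples = audio
    return model_sr, resample_linear(samples, src_sr, model_sr)
```

For example, one second of mic audio captured at 48000Hz would come out as 40000 samples tagged with a 40000Hz rate, so its duration, and therefore its speed and pitch, stays unchanged.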

Pascal Aznar · Answer 1 · Sun May 05 2024 07:05:21 GMT+0800 (China Standard Time)

We can't control that; it's what Gradio gives us.