Non-English speech transformed into a weird language

Question

BahzBeih opened this issue 7 months ago

BahzBeih commented 7 months ago

Peace. Non-English speech gets transformed into a weird language. I think it only works with English speech right now.

Karen Palacio · Answer 1 · Sat Dec 16 2023 12:49:40 GMT+0800 (China Standard Time)

Experienced the same with Spanish audio. It sounds kind of German after denoising.

BahzBeih · Answer 2 · Sat Dec 16 2023 13:04:02 GMT+0800 (China Standard Time)

I faced the problem with Arabic, and I have the same problem with Adobe's online audio enhance tool.

Zhe Niu · Answer 3 · Mon Dec 18 2023 14:13:46 GMT+0800 (China Standard Time)

The current model is mainly trained on English datasets and may not work as well with other languages. We hope to expand its language support in the future, and contributions are always welcome.

peili · Answer 4 · Mon Dec 18 2023 16:57:17 GMT+0800 (China Standard Time)

@enhuiz Are those English datasets available anywhere?

wolfgang-wp · Answer 5 · Tue Dec 19 2023 06:17:33 GMT+0800 (China Standard Time)

@enhuiz I'd like to help and contribute with other language models as well. Can you provide datasets as a reference?

anrice · Answer 6 · Wed Dec 20 2023 03:27:40 GMT+0800 (China Standard Time)

Hello @enhuiz and @ZohaibAhmed,

I've been following the discussion on the challenges faced with non-English audio processing using the resemble-enhance tool. Like others here, I attempted to train a model using German language samples. However, without adequate reference datasets or examples, the training process did not yield a reasonable model (.pt checkpoint).

The model's performance with German language samples was suboptimal, leading to outcomes that were not practically usable. This experience aligns with what others have reported regarding Spanish and Arabic audio processing. It seems evident that the current model's training and optimization are heavily skewed towards English datasets.

I am keen on contributing to the enhancement of the tool for better performance with non-English languages, particularly German. Any guidance on accessing suitable datasets or reference models that have been effectively trained on non-English languages would be highly beneficial. The availability of such resources would greatly aid in developing more robust and language-inclusive models.

Thank you for your efforts in creating this tool, and I look forward to any possibility of collaboration or contribution towards its improvement in handling diverse languages.

Xylphe · Answer 7 · Thu Dec 28 2023 00:48:06 GMT+0800 (China Standard Time)

Hello!
Same problem here with French.
Are you familiar with Mozilla's Common Voice initiative?
You could use it to train the model on other languages :)

4lvrz · Answer 8 · Fri Feb 23 2024 19:04:17 GMT+0800 (China Standard Time)

Nice solution, it could do the trick! However, Common Voice is poorly supervised, and using degraded samples to train the enhance stage might be a problem. Does anyone know if high audio quality is essential for training the enhance system?
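If degraded clips do turn out to hurt training, one option is to screen them out before training. Below is a rough Python sketch using only the standard library. The heuristic (loud-frame level minus quiet-frame level as a crude SNR proxy), the function names, the frame length, and the 20 dB threshold are all assumptions made for illustration; none of this comes from resemble-enhance or Common Voice.

```python
import math


def frame_rms_db(samples, frame_len=400):
    """RMS level in dB for each full frame of float samples in [-1, 1]."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # Floor the RMS so digital silence does not hit log10(0).
        levels.append(20 * math.log10(max(rms, 1e-10)))
    return levels


def rough_snr_db(samples, frame_len=400):
    """Crude SNR proxy: level of loud frames minus level of quiet frames."""
    levels = sorted(frame_rms_db(samples, frame_len))
    if len(levels) < 2:
        return 0.0
    quiet = levels[int(0.1 * (len(levels) - 1))]  # roughly the noise floor
    loud = levels[int(0.9 * (len(levels) - 1))]   # roughly the speech level
    return loud - quiet


def keep_clip(samples, min_snr_db=20.0):
    """Hypothetical filter: keep clips whose proxy SNR clears the threshold."""
    return rough_snr_db(samples) >= min_snr_db
```

This only catches clips with a high noise floor. It says nothing about clipping, reverb, or codec artifacts, so it is a first-pass filter at best.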

Stan Kirdey · Answer 9 · Sat Mar 16 2024 04:41:23 GMT+0800 (China Standard Time)

I really like this tool for denoising, but enhancement doesn't really work on most of the samples. I found that enhancement and denoising are done better in another open source project, https://github.com/ruizhecao96/CMGAN, which also works very well on non-English languages.

Robin Bozan · Answer 10 · Fri Apr 26 2024 19:39:02 GMT+0800 (China Standard Time)

The demos in the repo sound great. But do you know if there's an easier way to use it? For example, a CLI tool where I can just input an MP3 and it outputs an enhanced MP3?
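One way to get that kind of MP3-in, MP3-out workflow is a thin wrapper around ffmpeg. Below is a minimal Python sketch, not an existing tool. The names `decode_cmd`, `encode_cmd`, and `enhance_mp3` are made up for illustration, `enhance_wav` is a placeholder for whatever model call you plug in (I don't know CMGAN's actual entry point), and the 16 kHz mono default is an assumption you should match to the model you use.

```python
import subprocess
import tempfile
from pathlib import Path


def decode_cmd(src, wav_path, sample_rate=16000):
    """ffmpeg command that decodes any input to mono WAV at sample_rate."""
    return ["ffmpeg", "-y", "-i", str(src),
            "-ac", "1", "-ar", str(sample_rate), str(wav_path)]


def encode_cmd(wav_path, dst):
    """ffmpeg command that encodes a WAV file; format follows dst's extension."""
    return ["ffmpeg", "-y", "-i", str(wav_path), str(dst)]


def enhance_mp3(src, dst, enhance_wav, sample_rate=16000, run=subprocess.run):
    """Decode src, call enhance_wav(noisy_path, clean_path), encode to dst.

    enhance_wav is a user-supplied callable (hypothetical); it must read the
    first path and write the enhanced audio to the second path.
    """
    with tempfile.TemporaryDirectory() as tmp:
        noisy = Path(tmp) / "noisy.wav"
        clean = Path(tmp) / "clean.wav"
        run(decode_cmd(src, noisy, sample_rate), check=True)
        enhance_wav(noisy, clean)
        run(encode_cmd(clean, dst), check=True)
```

Usage would look like `enhance_mp3("talk.mp3", "talk_enhanced.mp3", my_model_fn)`, where `my_model_fn` wraps the enhancement model. It needs ffmpeg on your PATH with MP3 encoding support.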

kanjieater · Answer 11 · Fri Jun 28 2024 12:09:39 GMT+0800 (China Standard Time)

I was hoping to use this for Japanese, but it seems like I'll need to hold out.