resemble-ai / resemble-enhance

AI powered speech denoising and enhancement

Home Page: https://huggingface.co/spaces/ResembleAI/resemble-enhance

Non-English speech transformed to weird language

BahzBeih opened this issue

Peace, non-English speech gets transformed into a weird-sounding language. I think it only works with English speech right now.

Experienced the same with Spanish audio. It sounds kind of German after denoising.

I faced the same problem with Arabic, and I also get it with Adobe's online audio enhancement tool.

The current model is mainly trained on English datasets and may not work as well with other languages. We hope to expand its language support in the future, and contributions are always welcome.

@enhuiz Are those English datasets available anywhere?

@enhuiz I'd like to help and contribute with other language models as well. Can you provide datasets as a reference?

Hello @enhuiz and @ZohaibAhmed,

I've been following the discussion on the challenges faced with non-English audio processing using the resemble-enhance tool. Like others here, I ran into this problem, so I attempted to train a model using German language samples. However, without adequate reference datasets or examples, the training process did not yield a reasonable model checkpoint (.pt).

The model's performance with German language samples was suboptimal, leading to outcomes that were not practically usable. This experience aligns with what others have reported regarding Spanish and Arabic audio processing. It seems evident that the current model's training and optimization are heavily skewed towards English datasets.

I am keen on contributing to the enhancement of the tool for better performance with non-English languages, particularly German. Any guidance on accessing suitable datasets or reference models that have been effectively trained on non-English languages would be highly beneficial. The availability of such resources would greatly aid in developing more robust and language-inclusive models.

Thank you for your efforts in creating this tool, and I look forward to any possibility of collaboration or contribution towards its improvement in handling diverse languages.

Hello!
Same problem here with French.
Are you familiar with Mozilla's Common Voice initiative?
You could use it to train the model on other languages :)

Nice solution, it could do the trick! However, Common Voice is poorly supervised, and using degraded samples to train the enhancement stage might be a problem. Does anyone know whether high audio quality is essential for training an enhancement system?
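On the audio quality question: supervised enhancement and denoising models are commonly trained on pairs, where the target is a clean recording and the input is a synthetically degraded copy of it. If the "clean" side is already noisy, the model is effectively taught to reproduce that noise. Below is a minimal sketch of how such a pair can be built by mixing noise into clean speech at a chosen signal-to-noise ratio. It is not taken from the resemble-enhance training code; the function names (`rms`, `mix_at_snr`, `measured_snr_db`) and the sine-wave stand-in for speech are assumptions made for illustration only.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so the mix has the requested SNR in dB.

    Hypothetical helper for illustration; not part of resemble-enhance.
    """
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise is silent")
    # SNR_dB = 20 * log10(rms(clean) / rms(scaled_noise))
    target_noise_rms = rms(clean) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR of a noisy signal, given the clean reference it was built from."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 20 * math.log10(rms(clean) / rms(residual))


if __name__ == "__main__":
    rng = random.Random(0)
    sample_rate = 16000
    # One second of a 220 Hz tone stands in for a clean speech clip.
    clean = [0.5 * math.sin(2 * math.pi * 220 * t / sample_rate)
             for t in range(sample_rate)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]
    noisy = mix_at_snr(clean, noise, snr_db=10)
    # (noisy, clean) would be one input/target training pair.
    print(round(measured_snr_db(clean, noisy), 2))
```

The point for Common Voice is that `measured_snr_db` only works because the clean reference is known. With crowd-sourced clips the reference itself contains noise, so some filtering for recording quality would probably be needed before using them as targets.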

I really like this tool for denoising, but enhancement doesn't really work on most of my samples. I found that enhancement and denoising are done better in another open source project, https://github.com/ruizhecao96/CMGAN, which also works very well on non-English languages.

The demos in the repo sound great. But do you know if there's an easier way to use this? For example, a CLI tool where I can just input an MP3 and it outputs an enhanced MP3?

I was hoping to use this for Japanese, but seems like I'll need to hold out.