asteroid-team / asteroid

The PyTorch-based audio source separation toolkit for researchers

Home Page: https://asteroid-team.github.io/


mossFormer2 & sepTDA models

jeromew opened this issue · comments

🚀 Feature

I suggest adding the MossFormer2 and SepTDA models.

Motivation

Both models appear to improve on the state of the art for the speaker separation task.
See https://paperswithcode.com/sota/speech-separation-on-wsj0-2mix

sepTDA :

mossformer2:

What you'd like

An implementation of the models in asteroid, with a working pretrained model for inference.

Alternatives

I managed to get MossFormer2 inference working via https://modelscope.cn/models/iic/speech_mossformer2_separation_temporal_8k/summary
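For anyone wanting to try the same workaround, here is a minimal sketch of what that inference can look like. It assumes the `modelscope` package is installed, that the model is exposed through its `speech_separation` pipeline task, and that the result dictionary carries one raw 16-bit PCM byte string per separated speaker under `output_pcm_list`. The helper names `write_pcm16_wav` and `separate_file` are hypothetical, and the 8 kHz sample rate is inferred from the model name.

```python
import wave


def write_pcm16_wav(path, pcm_bytes, sample_rate=8000):
    """Write raw 16-bit mono PCM bytes to a WAV file (hypothetical helper)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)  # 8 kHz assumed from the model name
        wav_file.writeframes(pcm_bytes)


def separate_file(mix_path,
                  model_id="iic/speech_mossformer2_separation_temporal_8k"):
    """Run the ModelScope separation pipeline and save one WAV per speaker."""
    # Imported lazily so the WAV helper works without modelscope installed.
    from modelscope.pipelines import pipeline
    from modelscope.utils.constant import Tasks

    separation = pipeline(Tasks.speech_separation, model=model_id)
    result = separation(mix_path)

    out_paths = []
    # Assumption: each entry is an int16 PCM byte string for one speaker.
    for i, pcm in enumerate(result["output_pcm_list"]):
        out_path = f"output_spk{i}.wav"
        write_pcm16_wav(out_path, pcm)
        out_paths.append(out_path)
    return out_paths
```

Calling `separate_file("mix_8k.wav")` (a hypothetical 8 kHz mono input file) would download the checkpoint on first use and write `output_spk0.wav`, `output_spk1.wav`, and so on to the current directory.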

Additional context

I am trying to separate sources from a difficult audio track with an unknown number of speakers (opera music plus many heavily overlapping speakers).

Hello,

Thank you for the issue. Would you like to contribute these models? We would certainly welcome them!

Hello, thanks for your response.

I am afraid I am too far from this field at the moment to contribute models. I was just playing around with source separation models to try to solve a CTF puzzle involving a difficult-to-parse audio mix. I will join the Slack channel if things change.

I am closing this issue, as I am sure you are not short of models to integrate into asteroid and these two will resurface if they prove key to the field. In the meantime, you will have one less issue on GitHub!