asteroid-team / asteroid

The PyTorch-based audio source separation toolkit for researchers

Home Page: https://asteroid-team.github.io/

PIT Loss for multichannel audio for speech separation

SutirthaChakraborty opened this issue

I have 4-channel audio generated by my model (left, right, side, mid).
How can I apply a PIT loss to it?
The shapes of the tensors are:
Speaker one: [batch,channel,time]
Speaker two: [batch,channel,time]

If I need to apply PIT, how should I stack them: as [batch,channel,speaker,time]?

If I convert the output to mono, or take the mean over the channels, the model is unable to learn the 4 channels properly.

I think the channel should be first, in order to build the permutation matrix of dimension (batch, speaker, speaker) with broadcasting.
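
For illustration, here is a minimal sketch of the broadcasting idea in plain PyTorch. It is not Asteroid's own implementation. It assumes the two speakers are stacked into tensors of shape [batch, speaker, channel, time] (any other layout can be permuted into this one first), uses MSE averaged over channel and time as the per-pair loss, and the function name `multichannel_pit_mse` is made up for this example.

```python
from itertools import permutations

import torch


def multichannel_pit_mse(est, ref):
    """PIT with an MSE pair loss (hypothetical helper, not part of Asteroid).

    est, ref: tensors of shape [batch, speaker, channel, time] (assumed layout).
    Returns the batch-mean loss under the best permutation and, per batch item,
    the permutation p such that estimate i is matched with reference p[i].
    """
    n_src = est.shape[1]

    # Broadcast estimates against references:
    # [batch, spk, 1, ch, time] - [batch, 1, spk, ch, time]
    # then average over channel and time -> pairwise matrix [batch, spk, spk],
    # where entry [b, i, j] is the loss between estimate i and reference j.
    pw = ((est.unsqueeze(2) - ref.unsqueeze(1)) ** 2).mean(dim=(-2, -1))

    perms = list(permutations(range(n_src)))
    rows = torch.arange(n_src)

    # For each permutation, average the matched entries pw[b, i, p[i]].
    per_perm = torch.stack(
        [pw[:, rows, torch.tensor(p)].mean(dim=1) for p in perms], dim=1
    )  # [batch, n_perms]

    # Keep the cheapest permutation for every batch item.
    min_loss, best = per_perm.min(dim=1)
    return min_loss.mean(), [perms[k] for k in best.tolist()]


# Usage: stack the two speakers on a new speaker axis (dim=1).
spk1 = torch.randn(8, 4, 16000)  # [batch, channel, time]
spk2 = torch.randn(8, 4, 16000)
ref = torch.stack([spk1, spk2], dim=1)  # [batch, speaker, channel, time]
est = torch.randn_like(ref, requires_grad=True)  # stand-in for a model output
loss, best_perms = multichannel_pit_mse(est, ref)
loss.backward()
```

Because the channel axis is reduced only inside the pair loss, all 4 channels of a speaker are permuted together and nothing is mixed down to mono. Trying every permutation is fine for two or three speakers but grows factorially with the speaker count.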