Stereo audio
matthiasanderer opened this issue
Would this also work for stereo (i.e. 2 channel) audio?
I wonder how best to adapt the code for this, especially since the timm parts have already been trimmed down from 3 channels to 1 channel.
Hi,
I think it is doable, even with our pretrained model.
- These are the lines where we select the first channel; you need to change them.
Line 112 in a1a3eec
Line 116 in a1a3eec
- You also need to rework the fbank extraction so that its output has two channels.
Line 126 in a1a3eec
This introduces a new dimension, which was squeezed out for single-channel fbanks, so you also need to take care of the input pre-processing on the model side.
ssast/src/models/ast_models.py
Line 436 in a1a3eec
Note that we do this in multiple forward passes; the line above is just one of them.
- Then you need to change the model so that it takes two input channels instead of one.
ssast/src/models/ast_models.py
Line 130 in a1a3eec
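A minimal sketch of what the steps above might look like, assuming PyTorch. It is an illustration only, not code from the repo: `stack_channel_features`, `widen_patch_projection`, and `toy_features` are hypothetical names, the tensor sizes are made up, and copying the pretrained single-channel kernel into both channels is just one possible way to keep using the pretrained weights.

```python
import torch
import torch.nn as nn


def stack_channel_features(waveform, feature_fn):
    """Apply a mono feature extractor to each channel and stack the results.

    waveform:   tensor of shape (channels, samples)
    feature_fn: hypothetical callable mapping a (1, samples) tensor to a
                (frames, bins) tensor, e.g. a wrapper around the fbank call
    returns:    tensor of shape (channels, frames, bins)
    """
    per_channel = [feature_fn(waveform[c:c + 1]) for c in range(waveform.shape[0])]
    return torch.stack(per_channel, dim=0)


def widen_patch_projection(proj, in_channels=2):
    """Build a Conv2d that accepts `in_channels` inputs from a 1-channel one.

    The pretrained kernel is repeated across the new channels and divided by
    `in_channels`, so identical channels give the same output as before.
    This initialisation is an assumption, not something the repo prescribes.
    """
    new_proj = nn.Conv2d(in_channels, proj.out_channels,
                         kernel_size=proj.kernel_size, stride=proj.stride,
                         padding=proj.padding, bias=proj.bias is not None)
    with torch.no_grad():
        new_proj.weight.copy_(proj.weight.repeat(1, in_channels, 1, 1) / in_channels)
        if proj.bias is not None:
            new_proj.bias.copy_(proj.bias)
    return new_proj


def toy_features(mono):
    """Stand-in for a real fbank extractor: (1, samples) -> (frames, 8 bins)."""
    return mono.reshape(-1, 8)


# Toy usage with made-up sizes (not the repo's real configuration).
stereo = torch.randn(2, 800)                           # (channels, samples)
fbank = stack_channel_features(stereo, toy_features)   # (2, 100, 8)

# Model side: the channel dim already exists, so no unsqueeze is needed;
# only add the batch dim and swap the last two axes.
batch = fbank.unsqueeze(0).transpose(2, 3)             # (1, 2, 8, 100)

mono_proj = nn.Conv2d(1, 16, kernel_size=(8, 2), stride=(8, 2))
stereo_proj = widen_patch_projection(mono_proj)
patches = stereo_proj(batch)                           # (1, 16, 1, 50)
```

With this initialisation, a stereo input whose two channels are identical produces the same patch embeddings as the original single-channel layer, which may make fine-tuning from the pretrained checkpoint less disruptive.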
In short, it needs some (careful) changes to the code, but it is doable. I am not sure about your purpose, but it will be easier if you can add the two channels together into a single channel.
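For the simpler route, a minimal sketch of mixing stereo down to one channel before the existing pipeline, assuming PyTorch and a `(channels, samples)` waveform tensor. `downmix_to_mono` is a hypothetical helper, and it averages the channels rather than summing them (a choice made here to keep the amplitude range unchanged).

```python
import torch


def downmix_to_mono(waveform):
    """Average all channels of a (channels, samples) tensor into one channel.

    Averaging instead of plain summing is a choice made in this sketch so the
    result stays in the same amplitude range as the inputs.
    """
    return waveform.mean(dim=0, keepdim=True)


stereo = torch.tensor([[0.2, -0.4, 0.6],
                       [0.0,  0.4, 0.2]])
mono = downmix_to_mono(stereo)   # shape (1, 3)
```

The result keeps a leading channel dimension of size 1, so it can be passed to code that expects a single-channel waveform without further reshaping.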
-Yuan