YuanGongND / ssast

Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".

How to convert fbank features back to audio?

linmou opened this issue · comments

Since the fbank features reconstructed by SSAST are not straightforward to work with, how can I transform them back into raw audio for further analysis?

Hi there,

The goal of the reconstruction loss here is just to force the model to learn a good audio representation. We didn't mean to make the model a strong reconstructor. But if you want to convert spectrograms back to waveforms, you will need a vocoder (not included in this repo).

-Yuan

Thanks for your warm reply.
Can you recommend any vocoder? I want to invert fbank features back to audio.

Hi there,

I am not familiar with vocoders - you can check the GitHub topic list: https://github.com/topics/vocoder. Note that most of these are for TTS (speech) rather than general audio.

-Yuan
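
For anyone who wants a rough baseline that does not need a trained vocoder, below is a minimal numpy sketch using Griffin-Lim phase reconstruction. This is a classical algorithm swapped in for illustration. It is not a neural vocoder and not part of this repo, so expect noticeably degraded audio. Everything specific here is an assumption: that the features were normalized as `(fbank - mean) / (2 * std)`, that they are natural-log mel power energies, the placeholder `mean`/`std` defaults (substitute the values from your own dataloader config), the 16 kHz / 10 ms shift / Hann window settings, and the triangular mel filterbank, which only approximates the Kaldi-style one. All function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 1127.0 * np.log(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (np.exp(m / 1127.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr, f_min=20.0):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1). Approximation only."""
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(f_min), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (ctr - lo)
        falling = (hi - freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def stft(x, n_fft, hop, window):
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)  # (frames, bins)

def istft(spec, n_fft, hop, window):
    frames = np.fft.irfft(spec, n=n_fft, axis=1) * window
    length = n_fft + hop * (len(spec) - 1)
    out, norm = np.zeros(length), np.zeros(length)
    for i, frame in enumerate(frames):  # weighted overlap-add
        out[i * hop:i * hop + n_fft] += frame
        norm[i * hop:i * hop + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-3)

def griffin_lim(mag, n_fft, hop, n_iter=32, seed=0):
    """Estimate a waveform whose STFT magnitude is close to `mag` (frames, bins)."""
    window = np.hanning(n_fft)
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))  # random initial phase
    for _ in range(n_iter):
        x = istft(mag * phase, n_fft, hop, window)
        spec = stft(x, n_fft, hop, window)
        phase = spec / np.maximum(np.abs(spec), 1e-8)  # keep phase, discard magnitude
    return istft(mag * phase, n_fft, hop, window)

def fbank_to_audio(fbank_norm, mean=-4.27, std=4.57, sr=16000,
                   n_fft=512, hop=160, n_iter=32):
    """fbank_norm: (frames, n_mels) normalized log-mel features (assumed layout)."""
    log_mel = fbank_norm * (2.0 * std) + mean       # undo assumed normalization
    mel_power = np.exp(log_mel)                     # undo natural log
    fb = mel_filterbank(fbank_norm.shape[1], n_fft, sr)
    # Least-squares inversion of the mel projection, clipped to be non-negative.
    lin_power = np.maximum(mel_power @ np.linalg.pinv(fb).T, 0.0)
    return griffin_lim(np.sqrt(lin_power), n_fft, hop, n_iter)
```

Usage would look like `audio = fbank_to_audio(reconstructed_fbank)`, where `reconstructed_fbank` is a `(frames, n_mels)` numpy array. The output length is `n_fft + hop * (frames - 1)` samples. The mel projection discards fine spectral detail and Griffin-Lim only guesses the phase, so the result is best suited to a quick listening check rather than quantitative analysis.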