Suitability of ts2vec Framework for Lengthy Mono Audio Waveforms

Question

Willtl opened this issue a year ago · comments

For mono audio classification on raw waveforms, can the ts2vec framework handle 500 ms frames at a sample rate of 22050 Hz? Each frame would be a univariate time series of shape (1, 11025). Is a length of 11025 too long for the ts2vec framework?

Have you tested the ts2vec framework on longer time series datasets, and is there any existing benchmark for evaluating its performance on such lengthy inputs?
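For reference, the frame length in the question follows from 0.5 s × 22050 samples/s = 11025 samples. A minimal NumPy sketch of that framing step is below; `frame_waveform` is a hypothetical helper written for illustration, not part of ts2vec, and the all-zero waveform is only a placeholder for real audio.

```python
import numpy as np

def frame_waveform(waveform, sample_rate=22050, frame_ms=500):
    """Split a 1-D mono waveform into non-overlapping frames.

    Hypothetical helper for illustration; trailing samples that do not
    fill a whole frame are dropped.
    """
    frame_len = sample_rate * frame_ms // 1000   # 22050 * 500 // 1000 = 11025
    n_frames = len(waveform) // frame_len
    return waveform[: n_frames * frame_len].reshape(n_frames, frame_len)

# Placeholder input: 3 seconds of silence at 22050 Hz.
waveform = np.zeros(3 * 22050, dtype=np.float32)
frames = frame_waveform(waveform)
print(frames.shape)  # (6, 11025)
```

Each row is one 500 ms frame of 11025 samples, i.e. one univariate series of the length asked about.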

Bharat Sharma · Answer 1 · Fri Jun 16 2023 21:47:56 GMT+0800 (China Standard Time)

I recently used this model on time series data sampled at 1 kHz with a shape of (1, 15000). The best performance I have been able to get out of it is 0.56 AUC. The model splits any time series longer than 3000, so I am guessing there is a performance drop for time series longer than 3000. I might be wrong, though; either way, I haven't been able to get good performance.
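To make the splitting concrete, here is a rough sketch of what cutting a long series into pieces of at most 3000 points looks like. This is an assumption-based illustration of the behavior described above, not ts2vec's actual code: `split_into_segments` and its `max_len` parameter are hypothetical names, and the library may choose the number of segments differently.

```python
import math
import numpy as np

def split_into_segments(series, max_len=3000):
    """Split a 1-D series into near-equal segments of at most max_len points.

    Hypothetical helper for illustration only; not taken from ts2vec.
    """
    n_segments = math.ceil(len(series) / max_len)
    return np.array_split(series, n_segments)

# The (1, 15000) series from this answer: five segments of 3000 points.
long_series = split_into_segments(np.arange(15000))
print([len(s) for s in long_series])   # [3000, 3000, 3000, 3000, 3000]

# The 11025-sample audio frame from the question: four shorter segments.
audio_frame = split_into_segments(np.arange(11025))
print([len(s) for s in audio_frame])   # [2757, 2756, 2756, 2756]
```

Under this assumption, one 3000-point segment spans 3 s of a 1 kHz signal but only about 136 ms (3000 / 22050 s) of 22050 Hz audio, so a single segment would cover far less context for raw audio.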