ddlBoJack / emotion2vec

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

The WeChat group QR code has expired again

splinter21 opened this issue

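Actually, I have a use case: for long audio, I need to slice it and compute emotion classification probabilities per slice, e.g. one result every 5 s. The current pipeline API is wrapped too rigidly to allow this; it only supports a global average that yields a single result. It would be great if the pipeline interface accepted an extra slice-length argument, so that the returned probability vector gained a time dimension.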

You can cut the audio into 5 s chunks before forwarding them through the model. In my opinion, this is the better approach.
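A minimal sketch of that chunking step, assuming the audio is already loaded as a 1-D NumPy array. The helper names `split_into_chunks` and `classify_chunks`, the `keep_remainder` flag, and the `classify_fn` callable are hypothetical and not part of emotion2vec; `classify_fn` stands in for whatever call you use to get a probability vector for one chunk.

```python
import numpy as np


def split_into_chunks(waveform, sample_rate, chunk_seconds=5.0, keep_remainder=True):
    """Split a 1-D waveform into consecutive, non-overlapping fixed-length chunks."""
    chunk_len = int(round(chunk_seconds * sample_rate))
    if chunk_len <= 0:
        raise ValueError("chunk_seconds * sample_rate must be at least one sample")
    chunks = [
        waveform[start:start + chunk_len]
        for start in range(0, len(waveform), chunk_len)
    ]
    # Optionally drop a final chunk that is shorter than chunk_len.
    if chunks and not keep_remainder and len(chunks[-1]) < chunk_len:
        chunks.pop()
    return chunks


def classify_chunks(chunks, classify_fn):
    """Run classify_fn on every chunk and stack the results.

    classify_fn is a hypothetical placeholder: any callable that maps one
    chunk to a 1-D probability vector. The result has shape
    (num_chunks, num_classes), i.e. the extra time dimension asked for above.
    """
    return np.stack([np.asarray(classify_fn(chunk)) for chunk in chunks])
```

For example, a 12 s recording at an assumed sample rate of 16 kHz splits into chunks of 5 s, 5 s, and 2 s; pass `keep_remainder=False` if the short tail should be discarded rather than classified.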

QR code updated