wangth2001

PyTorch implementation of Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic Models

Language: Python | License: MIT | 1700

AdvSV stands as the first dataset developed specifically for evaluating Speaker Verification (SV) systems against adversarial attacks. It aims to benchmark the robustness of ASV models in the face of such attacks and offers vital resources for researchers to explore the characteristics of adversarial and replay attacks in this domain.

Language: HTML | 1000

Vietnam-Celeb

800

VoiceprintRecognition-Pytorch

This project implements several advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.

Language: Python | License: Apache-2.0 | 68000

Multilingual-PR

Phoneme recognition using the pre-trained models Wav2vec2, HuBERT and WavLM. This project compares three self-supervised models, Wav2vec (2019, 2020), HuBERT (2021) and WavLM (2022), each pretrained on a corpus of English speech, and uses them in various ways to perform phoneme recognition for different languages with a network trained with the Connectionist Temporal Classification (CTC) algorithm.

Language: Python | 18400

pytorch-cosine-annealing-with-warmup

Language: Python | License: MIT | 42300

s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Language: Python | License: Apache-2.0 | 215100

w2v2-speaker

Research code for the paper "Fine-tuning wav2vec2 for speaker recognition" found at https://arxiv.org/abs/2109.15053

Language: Python | License: MIT | 14100

shadowsocksr

Python port of ShadowsocksR

Language: Python | License: Apache-2.0 | 331400

BaiduPCS-Go

Based on the original iikira/BaiduPCS-Go, with added support for saving files from share links and rapid-upload links.

Language: Go | License: Apache-2.0 | 273300

aliyunpan

Command-line client for Aliyun Drive, with support for JavaScript plugins and sync/backup.

Language: Go | License: Apache-2.0 | 385500

ContextMenuManager

🖱️ A pure Windows right-click context menu manager

Language: C# | License: GPL-3.0 | 1129500

clash_for_windows_pkg_backup

Backup of the installer packages for the final release of Clash for Windows

12300

ScriptsForVoxBlink

A repo containing download guidance and the corresponding scripts for the VoxBlink dataset.

Language: Python | License: NOASSERTION | 1700

HahaPod

The repository for collecting the HahaPod dataset.

Language: Python | License: NOASSERTION | 300

senet.pytorch

PyTorch implementation of SENet

Language: Python | License: MIT | 225700

External-Attention-pytorch

🍀 PyTorch implementation of various attention mechanisms, MLP, re-parameterization, and convolution modules, helpful for further understanding the papers. ⭐⭐⭐

Language: Python | License: MIT | 1109300

TIM-Net_SER

[ICASSP 2023] Official TensorFlow implementation of "Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition".

Language: Python | License: GPL-3.0 | 15300

auditok

An audio/acoustic activity detection and audio segmentation tool

Language: Python | License: MIT | 72200

Toroidal-PSDA

A probabilistic scoring backend for length-normalized embeddings.

Language: Python | License: MIT | 1000

CREMA-D

Crowd Sourced Emotional Multimodal Actors Dataset (CREMA-D)

Language: R | License: NOASSERTION | 32200

SpeechEmotionRecognition-emodb

Speech Emotion Recognition

Language: Python | 2500

zotero-actions-tags

Customize your Zotero workflow.

Language: TypeScript | License: AGPL-3.0 | 160200

Tianhao Wang's starred repositories

lhotse

fairseq

VoxTube

hw_seckill

SLT22_MultiHead-Factorized-Attentive-Pooling

Interspeech23_SelfPretraining

enskd

Diff-SV

AdvSV.github.io