Tianhao Wang (wangth2001)

Company: Beijing University of Posts and Telecommunications

Location: Beijing

Tianhao Wang's starred repositories

lhotse

Tools for handling speech data in machine learning projects.

Language: Python | License: Apache-2.0 | Stargazers: 898 | Issues: 0

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python | License: MIT | Stargazers: 29735 | Issues: 0

VoxTube

The VoxTube dataset official repository

Language: HTML | License: NOASSERTION | Stargazers: 57 | Issues: 0

hw_seckill

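Automatic flash-sale purchasing script for Huawei Mate series phones. Supports the Mate60, Mate60Pro, Mate60Pro+, Mate X5, and other models, including selection of color and version for these models.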

Language: Python | License: GPL-3.0 | Stargazers: 271 | Issues: 0

SLT22_MultiHead-Factorized-Attentive-Pooling

An attention-based backend allowing efficient fine-tuning of transformer models for speaker verification

Language: Python | Stargazers: 9 | Issues: 0
Language: Python | Stargazers: 2 | Issues: 0

enskd

Official implementation of the ICASSP 2024 paper: Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Speaker Verification

Language: Python | License: MIT | Stargazers: 13 | Issues: 0

Diff-SV

Pytorch implementation of Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic Models

Language: Python | License: MIT | Stargazers: 17 | Issues: 0

AdvSV.github.io

AdvSV stands as the first dataset developed specifically for evaluating Speaker Verification (SV) systems against adversarial attacks. It aims to benchmark the robustness of ASV models in the face of such attacks and offers vital resources for researchers to explore the characteristics of adversarial and replay attacks in this domain.

Language: HTML | Stargazers: 10 | Issues: 0

VoiceprintRecognition-Pytorch

This project uses a variety of advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++; more models may be supported in the future. It also supports the MelSpectrogram and Spectrogram data preprocessing methods.

Language: Python | License: Apache-2.0 | Stargazers: 680 | Issues: 0

Multilingual-PR

Phoneme recognition using the pre-trained models Wav2vec2, HuBERT and WavLM. This project compares three self-supervised models, Wav2vec (2019, 2020), HuBERT (2021) and WavLM (2022), each pretrained on a corpus of English speech, and uses them in various ways to perform phoneme recognition for different languages with a network trained with the Connectionist Temporal Classification (CTC) algorithm.

Language: Python | Stargazers: 184 | Issues: 0
Language: Python | License: MIT | Stargazers: 423 | Issues: 0

s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 2151 | Issues: 0

w2v2-speaker

Research code for the paper "Fine-tuning wav2vec2 for speaker recognition" found at https://arxiv.org/abs/2109.15053

Language: Python | License: MIT | Stargazers: 141 | Issues: 0

shadowsocksr

Python port of ShadowsocksR

Language: Python | License: Apache-2.0 | Stargazers: 3314 | Issues: 0

BaiduPCS-Go

Built on the original iikira/BaiduPCS-Go, with added support for saving files from share links and rapid-upload links.

Language: Go | License: Apache-2.0 | Stargazers: 2733 | Issues: 0

aliyunpan

Command-line client for Aliyun Drive, with support for JavaScript plugins and sync backup.

Language: Go | License: Apache-2.0 | Stargazers: 3855 | Issues: 0

ContextMenuManager

🖱️ A pure Windows right-click context menu manager

Language: C# | License: GPL-3.0 | Stargazers: 11295 | Issues: 0

clash_for_windows_pkg_backup

Backup of the final-version installer packages for Clash for Windows

Stargazers: 123 | Issues: 0

ScriptsForVoxBlink

A repo containing download guidance and corresponding scripts of the VoxBlink dataset.

Language: Python | License: NOASSERTION | Stargazers: 17 | Issues: 0

HahaPod

The repository for collecting HahaPod dataset.

Language: Python | License: NOASSERTION | Stargazers: 3 | Issues: 0

senet.pytorch

PyTorch implementation of SENet

Language: Python | License: MIT | Stargazers: 2257 | Issues: 0

External-Attention-pytorch

🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

Language: Python | License: MIT | Stargazers: 11093 | Issues: 0

TIM-Net_SER

[ICASSP 2023] Official Tensorflow implementation of "Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition".

Language: Python | License: GPL-3.0 | Stargazers: 153 | Issues: 0

auditok

An audio/acoustic activity detection and audio segmentation tool

Language: Python | License: MIT | Stargazers: 722 | Issues: 0

Toroidal-PSDA

A probabilistic scoring backend for length-normalized embeddings.

Language: Python | License: MIT | Stargazers: 10 | Issues: 0

CREMA-D

Crowd Sourced Emotional Multimodal Actors Dataset (CREMA-D)

Language: R | License: NOASSERTION | Stargazers: 322 | Issues: 0

SpeechEmotionRecognition-emodb

Speech Emotion Recognition

Language: Python | Stargazers: 25 | Issues: 0

zotero-actions-tags

Customize your Zotero workflow.

Language: TypeScript | License: AGPL-3.0 | Stargazers: 1602 | Issues: 0