wanghuii1

HangZhou, China

Hui Wang's starred repositories

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | Apache-2.0 | 129522 1121 15256

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | Apache-2.0 | 27037 184 4333

espnet

End-to-End Speech Processing Toolkit

Language: Python | Apache-2.0 | 8139 179 2332

demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Language: Python | MIT | 7947 150 532

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python | Apache-2.0 | 3946 91 1001

rnnoise

Recurrent neural network for audio noise reduction

Language: C | BSD-3-Clause | 3875 148 194

scenic

Scenic: A Jax Library for Computer Vision Research and Beyond

Language: Python | Apache-2.0 | 3154 39 243

FunClip

Open-source, accurate, and easy-to-use video speech recognition and clipping tool, with LLM-based AI clipping integrated.

Language: Python | MIT | 2803 27 68

Awesome-Learning-with-Label-Noise

A curated list of resources for Learning with Noisy Labels

DeepFilterNet

Noise suppression using deep filtering

Language: Python | NOASSERTION | 2201 32 267

pytorch-lightning-template

An easy, quick-to-adapt PyTorch-Lightning template. A simple, easy-to-use wrapper template: with only slight changes to your original PyTorch code, it can be adapted to Lightning. You can port your existing PyTorch code much more easily using this template while keeping the freedom to edit every function. It is also friendly to large projects, and there is no need to rewrite your config in Hydra.

Language: Jupyter Notebook | Apache-2.0 | 1268 9 13

onnxruntime-inference-examples

Examples for using ONNX Runtime for machine learning inferencing.

Language: C++ | MIT | 1072 40 147

voxceleb_trainer

In defence of metric learning for speaker recognition

Language: Python | MIT | 1007 30 172

torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

Language: Python | MIT | 905 11 105

DANN

PyTorch implementation of Domain-Adversarial Training of Neural Networks

Language: Python | MIT | 819 10 18

speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

Conv-TasNet

A PyTorch implementation of Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

Language: Python | 402 6 54

Dual-Path-RNN-Pytorch

Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation, implemented in PyTorch

Language: Python | Apache-2.0 | 401 4 60

SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model

Language: Python | MIT | 354 18 12

ava-dataset

The AVA dataset densely annotates 80 atomic visual actions in 351k movie clips with actions localized in space and time, resulting in 1.65M action labels with multiple labels per human occurring frequently.

CMGAN

Conformer-based Metric GAN for speech enhancement

Language: Python | MIT | 285 9 44

dscore

Diarization scoring tools.

Language: Python | BSD-2-Clause | 203 8 4

Awesome-Speaker-Diarization

Some comprehensive papers about speaker diarization

voxconverse

Spot the conversation: speaker diarisation in the wild

Auto-Tuning-Spectral-Clustering

This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap"

Language: Python | MIT | 101 7 2

MossFormer2

This is the audio sample repository for the speech separation model "MossFormer2".

Language: Python | MIT | 5800

speaker_extraction_SpEx

Multi-scale time-domain speaker extraction

Language: Python | GPL-3.0 | 54 3 5

EEND_dataprep

Language: Shell | 43 5 8

MSDWILD

[INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild.

Language: HTML | NOASSERTION | 32 4 3

ScriptsForVoxBlink

A repo containing download guidance and corresponding scripts of the VoxBlink dataset.

Language: Python | NOASSERTION | 17 1 4