Hui Wang (wanghuii1)

Location: Hangzhou, China

Hui Wang's starred repositories

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | Stargazers: 129522 | Issues: 1121 | Issues: 15256

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | License: Apache-2.0 | Stargazers: 27037 | Issues: 184 | Issues: 4333

espnet

End-to-End Speech Processing Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 8139 | Issues: 179 | Issues: 2332

demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Language: Python | License: MIT | Stargazers: 7947 | Issues: 150 | Issues: 532

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 3946 | Issues: 91 | Issues: 1001

rnnoise

Recurrent neural network for audio noise reduction

Language: C | License: BSD-3-Clause | Stargazers: 3875 | Issues: 148 | Issues: 194

scenic

Scenic: A Jax Library for Computer Vision Research and Beyond

Language: Python | License: Apache-2.0 | Stargazers: 3154 | Issues: 39 | Issues: 243

FunClip

Open-source, accurate, and easy-to-use video speech recognition and clipping tool, with LLM-based AI clipping integrated.

Language: Python | License: MIT | Stargazers: 2803 | Issues: 27 | Issues: 68

Awesome-Learning-with-Label-Noise

A curated list of resources for Learning with Noisy Labels

DeepFilterNet

Noise suppression using deep filtering

Language: Python | License: NOASSERTION | Stargazers: 2201 | Issues: 32 | Issues: 267

pytorch-lightning-template

An easy and quick-to-adapt PyTorch Lightning template. A simple, easy-to-use wrapper template: lightly modify your existing PyTorch code and it works with Lightning. You can port your previous PyTorch code much more easily with this template while keeping the freedom to edit every function. It is also friendly to large projects, and there is no need to rewrite your config in Hydra.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1268 | Issues: 9 | Issues: 13

onnxruntime-inference-examples

Examples for using ONNX Runtime for machine learning inferencing.

Language: C++ | License: MIT | Stargazers: 1072 | Issues: 40 | Issues: 147

voxceleb_trainer

In defence of metric learning for speaker recognition

Language: Python | License: MIT | Stargazers: 1007 | Issues: 30 | Issues: 172

torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

Language: Python | License: MIT | Stargazers: 905 | Issues: 11 | Issues: 105

DANN

PyTorch implementation of Domain-Adversarial Training of Neural Networks

Language: Python | License: MIT | Stargazers: 819 | Issues: 10 | Issues: 18

speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

Conv-TasNet

Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation, a PyTorch implementation

Dual-Path-RNN-Pytorch

Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation, implemented in PyTorch

Language: Python | License: Apache-2.0 | Stargazers: 401 | Issues: 4 | Issues: 60

SLAM-LLM

Speech, Language, Audio, Music Processing with Large Language Model

Language: Python | License: MIT | Stargazers: 354 | Issues: 18 | Issues: 12

ava-dataset

The AVA dataset densely annotates 80 atomic visual actions in 351k movie clips with actions localized in space and time, resulting in 1.65M action labels with multiple labels per human occurring frequently.

CMGAN

Conformer-based Metric GAN for speech enhancement

Language: Python | License: MIT | Stargazers: 285 | Issues: 9 | Issues: 44

dscore

Diarization scoring tools.

Language: Python | License: BSD-2-Clause | Stargazers: 203 | Issues: 8 | Issues: 4

Awesome-Speaker-Diarization

Some comprehensive papers about speaker diarization

voxconverse

Spot the conversation: speaker diarisation in the wild

Auto-Tuning-Spectral-Clustering

This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap"

Language: Python | License: MIT | Stargazers: 101 | Issues: 7 | Issues: 2

MossFormer2

This is the audio sample repository for speech separation model "MossFormer2".

Language: Python | License: MIT | Stargazers: 58 | Issues: 0 | Issues: 0

speaker_extraction_SpEx

Multi-scale time-domain speaker extraction

Language: Python | License: GPL-3.0 | Stargazers: 54 | Issues: 3 | Issues: 5

MSDWILD

[INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild.

Language: HTML | License: NOASSERTION | Stargazers: 32 | Issues: 4 | Issues: 3

ScriptsForVoxBlink

A repo containing download guidance and corresponding scripts of the VoxBlink dataset.

Language: Python | License: NOASSERTION | Stargazers: 17 | Issues: 1 | Issues: 4