eehuahua


eehuahua's starred repositories

pyroomacoustics

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Language: Python | MIT | 1454 | 0 | 0

gpuRIR

Python library for Room Impulse Response (RIR) simulation with GPU acceleration

Language: Cuda | AGPL-3.0 | 488 | 0 | 0

chineseocr

yolo3+ocr

Language: Python | MIT | 5940 | 0 | 0

uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

Language: Python | Apache-2.0 | 1557 | 0 | 0

awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

Apache-2.0 | 1614 | 0 | 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | MIT | 6275 | 0 | 0

MVSNet_pytorch

PyTorch Implementation of MVSNet

Language: Python | 617 | 0 | 0

CasMVSNet_pl

Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching using pytorch-lightning

Language: Jupyter Notebook | GPL-3.0 | 278 | 0 | 0

ADOP

Language: C++ | MIT | 2023 | 0 | 0

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Language: Python | Apache-2.0 | 11119 | 0 | 0

FastASR

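A C++ implementation of ASR inference. It has few dependencies, is simple to install, and runs inference quickly, including smoothly on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model and trained on the open-source wenetspeech dataset (10,000+ hours) or Alibaba's private dataset (60,000+ hours), so recognition quality is good and comparable to many commercial ASR products.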

Language: C | Apache-2.0 | 486 | 0 | 0

FACEGOOD-Audio2Face

http://www.facegood.cc

Language: Python | MIT | 1822 | 0 | 0

yry

Language: Python | Apache-2.0 | 16 | 0 | 0

menpo3d

Tools for manipulating 3D meshes within the Menpo project.

Language: Python | NOASSERTION | 165 | 0 | 0

non_rigid_icp

Modified version of the non-rigid iterative closest point (ICP) algorithm for fitting to noisy point clouds

Language: Python | 81 | 0 | 0

ECAPA-TDNN

Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when trained only on Vox2)

Language: Python | MIT | 605 | 0 | 0

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | Apache-2.0 | 38777 | 0 | 0

DualStyleGAN

[CVPR 2022] Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer

Language: Jupyter Notebook | NOASSERTION | 1642 | 0 | 0

face-alignment

2D and 3D face alignment library built using PyTorch

Language: Python | BSD-3-Clause | 7082 | 0 | 0

Deep3DFaceReconstruction

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019)

Language: Python | MIT | 2209 | 0 | 0

MeInGame

MeInGame: Create a Game Character Face from a Single Portrait, AAAI 2021

Language: Python | MIT | 653 | 0 | 0

book-text-to-speech

A book about Text-to-Speech (TTS) in Chinese.

Language: TeX | Apache-2.0 | 585 | 0 | 0

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | MIT | 20120 | 0 | 0

asv-subtools

Open-source tools for speaker recognition

Language: Python | Apache-2.0 | 599 | 0 | 0

lhotse

Tools for handling speech data in machine learning projects.

Language: Python | Apache-2.0 | 948 | 0 | 0

espnet

End-to-End Speech Processing Toolkit

Language: Python | Apache-2.0 | 8475 | 0 | 0

icefall

Language: Python | Apache-2.0 | 924 | 0 | 0

k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.

Language: Cuda | Apache-2.0 | 1126 | 0 | 0

DNS-Challenge

This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.

Language: Python | CC-BY-4.0 | 1102 | 0 | 0

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python | Apache-2.0 | 4163 | 0 | 0