eehuahua

eehuahua's starred repositories

pyroomacoustics

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Language: Python | License: MIT | Stars: 1428

gpuRIR

Python library for Room Impulse Response (RIR) simulation with GPU acceleration

Language: Cuda | License: AGPL-3.0 | Stars: 481

chineseocr

yolo3+ocr

Language: Python | License: MIT | Stars: 5924

uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

Language: Python | License: Apache-2.0 | Stars: 1556

awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

License: Apache-2.0 | Stars: 1583

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | License: MIT | Stars: 6030

MVSNet_pytorch

PyTorch Implementation of MVSNet

Language: Python | Stars: 614

CasMVSNet_pl

Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching using pytorch-lightning

Language: Jupyter Notebook | License: GPL-3.0 | Stars: 276

ADOP

Language: C++ | License: MIT | Stars: 2021

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Language: Python | License: Apache-2.0 | Stars: 10979

FastASR

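This project implements ASR inference in C++. It has few dependencies, is simple to install, and runs inference fast; it also runs smoothly on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model, and the training data is either the open-source wenetspeech dataset (10,000+ hours) or a private Alibaba dataset (60,000+ hours), so recognition quality is good and comparable to many commercial ASR products.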

Language: C | License: Apache-2.0 | Stars: 482

FACEGOOD-Audio2Face

http://www.facegood.cc

Language: Python | License: MIT | Stars: 1806

yry

Language: Python | License: Apache-2.0 | Stars: 16

menpo3d

Tools for manipulating 3D meshes within the Menpo project.

Language: Python | License: NOASSERTION | Stars: 165

non_rigid_icp

Modified version of non-rigid Iterative closest point algorithm for fitting to noisy point clouds

Language: Python | Stars: 81

ECAPA-TDNN

Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 on Vox1_O when trained only on Vox2)

Language: Python | License: MIT | Stars: 591

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | License: Apache-2.0 | Stars: 38697

DualStyleGAN

[CVPR 2022] Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer

Language: Jupyter Notebook | License: NOASSERTION | Stars: 1624

face-alignment

2D and 3D face alignment library built using PyTorch

Language: Python | License: BSD-3-Clause | Stars: 7033

Deep3DFaceReconstruction

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019)

Language: Python | License: MIT | Stars: 2183

MeInGame

MeInGame: Create a Game Character Face from a Single Portrait, AAAI 2021

Language: Python | License: MIT | Stars: 653

book-text-to-speech

A book about Text-to-Speech (TTS) in Chinese.

Language: TeX | License: Apache-2.0 | Stars: 580

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | Stars: 19637

asv-subtools

Open-source tools for speaker recognition

Language: Python | License: Apache-2.0 | Stars: 592

lhotse

Tools for handling speech data in machine learning projects.

Language: Python | License: Apache-2.0 | Stars: 936

espnet

End-to-End Speech Processing Toolkit

Language: Python | License: Apache-2.0 | Stars: 8354

icefall

Language: Python | License: Apache-2.0 | Stars: 902

k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.

Language: Cuda | License: Apache-2.0 | Stars: 1113

DNS-Challenge

This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.

Language: Python | License: CC-BY-4.0 | Stars: 1076

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python | License: Apache-2.0 | Stars: 4091