eehuahua

eehuahua's starred repositories

pyroomacoustics

Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Language: Python | License: MIT | Stargazers: 1454 | Issues: 0

gpuRIR

Python library for Room Impulse Response (RIR) simulation with GPU acceleration

Language: Cuda | License: AGPL-3.0 | Stargazers: 488 | Issues: 0

chineseocr

yolo3+ocr

Language: Python | License: MIT | Stargazers: 5940 | Issues: 0

uis-rnn

This is the library for the Unbounded Interleaved-State Recurrent Neural Network (UIS-RNN) algorithm, corresponding to the paper Fully Supervised Speaker Diarization.

Language: Python | License: Apache-2.0 | Stargazers: 1557 | Issues: 0

awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

License: Apache-2.0 | Stargazers: 1614 | Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | License: MIT | Stargazers: 6275 | Issues: 0

MVSNet_pytorch

PyTorch Implementation of MVSNet

Language: Python | Stargazers: 617 | Issues: 0

CasMVSNet_pl

Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching using pytorch-lightning

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 278 | Issues: 0

Language: C++ | License: MIT | Stargazers: 2023 | Issues: 0

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Language: Python | License: Apache-2.0 | Stargazers: 11119 | Issues: 0

FastASR

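This project implements ASR inference in C++. It has few dependencies, is easy to install, runs inference quickly, and runs smoothly on ARM platforms such as the Raspberry Pi 4B. The supported models are optimized from Google's Transformer model, and the training data is either the open-source wenetspeech dataset (10,000+ hours) or Alibaba's private dataset (60,000+ hours), so recognition quality is good and comparable to many commercial ASR products.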

Language: C | License: Apache-2.0 | Stargazers: 486 | Issues: 0

FACEGOOD-Audio2Face

http://www.facegood.cc

Language: Python | License: MIT | Stargazers: 1822 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 16 | Issues: 0

menpo3d

Tools for manipulating 3D meshes within the Menpo project.

Language: Python | License: NOASSERTION | Stargazers: 165 | Issues: 0

non_rigid_icp

Modified version of the non-rigid Iterative Closest Point (ICP) algorithm for fitting to noisy point clouds

Language: Python | Stargazers: 81 | Issues: 0

ECAPA-TDNN

Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 on Vox1_O when trained only on Vox2)

Language: Python | License: MIT | Stargazers: 605 | Issues: 0

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | License: Apache-2.0 | Stargazers: 38777 | Issues: 0

DualStyleGAN

[CVPR 2022] Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 1642 | Issues: 0

face-alignment

2D and 3D face alignment library built using PyTorch

Language: Python | License: BSD-3-Clause | Stargazers: 7082 | Issues: 0

Deep3DFaceReconstruction

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set (CVPRW 2019)

Language: Python | License: MIT | Stargazers: 2209 | Issues: 0

MeInGame

MeInGame: Create a Game Character Face from a Single Portrait, AAAI 2021

Language: Python | License: MIT | Stargazers: 653 | Issues: 0

book-text-to-speech

A book about Text-to-Speech (TTS) in Chinese.

Language: TeX | License: Apache-2.0 | Stargazers: 585 | Issues: 0

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | Stargazers: 20120 | Issues: 0

asv-subtools

Open-source tools for speaker recognition

Language: Python | License: Apache-2.0 | Stargazers: 599 | Issues: 0

lhotse

Tools for handling speech data in machine learning projects.

Language: Python | License: Apache-2.0 | Stargazers: 948 | Issues: 0

espnet

End-to-End Speech Processing Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 8475 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 924 | Issues: 0

k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.

Language: Cuda | License: Apache-2.0 | Stargazers: 1126 | Issues: 0

DNS-Challenge

This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.

Language: Python | License: CC-BY-4.0 | Stargazers: 1102 | Issues: 0

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 4163 | Issues: 0