ieyniie's repositories
stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
stable-diffusion-webui
Stable Diffusion web UI
slidev
Presentation Slides for Developers
jetson-containers
Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
ATST-SED
This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".
C8DASR-Baseline-NeMo
NeMo: a toolkit for conversational AI
OpenVoice
Instant voice cloning by MyShell.
SRP-DNN
A Python implementation of “SRP-DNN: Learning Direct-Path Phase Difference for Multiple Moving Sound Source Localization” [ICASSP 2022]
speechbrain
A PyTorch-based Speech Toolkit
TTS_coqui
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
faceswap
Deepfakes Software For All
VoiceprintRecognition-Pytorch
This project implements voiceprint recognition using the EcapaTdnn model.
pygsound
Impulse response generation based on state-of-the-art geometric sound propagation engine.
asteroid
The PyTorch-based audio source separation toolkit for researchers
VITS-Pytorch
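This is a PyTorch-based speech synthesis project that uses VITS. VITS is a speech synthesis method; this end-to-end model is very simple to use, requires no overly complex steps such as text alignment, and supports one-click training and generation, which greatly lowers the learning barrier.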
StableVideo
[ICCV 2023] StableVideo: Text-driven Consistency-aware Diffusion Video Editing
TAC
Transform-average-concatenate (TAC) method for end-to-end microphone permutation and number invariant ad-hoc beamforming.
NBSS
The official repo of NBC & SpatialNet
sudo_rm_rf
Code for SuDoRm-Rf networks for efficient audio source separation. SuDoRm-Rf stands for SUccessive DOwnsampling and Resampling of Multi-Resolution Features, which enables a more efficient way of separating sources from mixtures.
SSSfastMNMF
The code for multi-channel source separation and dereverberation such as FastMNMF1, FastMNMF2, and AR-FastMNMF2.
Beam-Guided-TasNet
Beam-guided TasNet
odas
ODAS: Open embeddeD Audition System
FAST-RIR
This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
IR-GAN
Augmenting Room Impulse Response
Beamforming-for-speech-enhancement
Simple delay-and-sum, MVDR, and CGMM-MVDR beamformers
nn-gev
Neural network supported GEV beamformer