gachaun's repositories
audiomentations
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
awesome-LLM-resourses
🧑‍🚀 The world's best summary of Chinese LLM resources
Awesome-MLLM-Hallucination
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).
Bert-VITS2
vits2 backbone with multilingual-bert
cat-catch
cat-catch (猫抓): a browser resource sniffing extension
CosyVoice
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Deep-Live-Cam
real time face swap and one-click video deepfake with only a single image
DeepFaceLive
Real-time face swap for PC streaming or video calls
espnet
End-to-End Speech Processing Toolkit
GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
gpt4free
The official gpt4free repository | a collection of various powerful language models
NativeSpeaker
Make your speaker talk in a native style with your own voice!
NDK_OpenGLES_3_0
A systematic Android OpenGL ES 3.0 tutorial, from beginner to mastery
OpenGLCamera2
🔥 Android OpenGL Camera 2.0 implementing 30+ filters and Douyin (TikTok) effects
PL-BERT
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
so-vits-svc-fork
so-vits-svc fork with realtime support, improved interface and more features.
speechbrain
A PyTorch-based Speech Toolkit
vits_chinese-1
Best-practice TTS based on BERT and VITS, with some features from Microsoft's NaturalSpeech; supports streaming output!
interview
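📚 A summary of fundamental C/C++ technical interview knowledge, covering the language, libraries, data structures, algorithms, systems, networking, and linking/loading libraries, plus interview experience, recruitment, and internal referral information.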
mnist-onnx-runtime
MoE model with onnx runtime
roop-cam
real time face swap and one-click video deepfake with only a single image (Uncensored)
sherpa-onnx
Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript
stable-ts
Transcription, forced alignment, and audio indexing with OpenAI's Whisper
talking-face-arxiv-daily
🎓 Update Talking-Face Research Papers Daily, Now Integrated with LLM Analysis.
tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
TTS-arxiv-daily
Automatically updates Text-to-Speech (TTS) papers daily using GitHub Actions (updated every 12 hours)
vits-simple-api
A simple VITS HTTP API, developed by extending Moegoe with additional features.
voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).