gachaun

gachaun's repositories

audiomentations

A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.

Language: Python, License: MIT

awesome-LLM-resourses

🧑‍🚀 The world's best summary of Chinese LLM resources

Awesome-MLLM-Hallucination

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

Bert-VITS2

VITS2 backbone with multilingual BERT

Language: Python, License: AGPL-3.0

cat-catch

cat-catch (猫抓): a browser resource sniffing extension

Language: JavaScript, License: GPL-3.0

CosyVoice

Multilingual large voice generation model with full-stack support for inference, training, and deployment.

Language: Python, License: Apache-2.0

Deep-Live-Cam

Real-time face swap and one-click video deepfake using only a single image

Language: Python, License: AGPL-3.0

DeepFaceLive

Real-time face swap for PC streaming or video calls

Language: Python, License: GPL-3.0

espnet

End-to-End Speech Processing Toolkit

Language: Python, License: Apache-2.0

GPT-SoVITS

Just 1 minute of voice data is enough to train a good TTS model (few-shot voice cloning).

Language: Python, License: MIT

gpt4free

The official gpt4free repository | a collection of various powerful language models

Language: Python, License: GPL-3.0

myheygenbeifen

Language: Python

NativeSpeaker

Make your speaker talk in a native style with your own voice!

Language: Python, License: Apache-2.0

NDK_OpenGLES_3_0

Android OpenGL ES 3.0: a systematic tutorial from beginner to mastery

Language: C++, License: Apache-2.0

OpenGLCamera2

🔥 Android OpenGL Camera 2.0 implementing more than 30 filters and Douyin effects

Language: C++

PL-BERT

Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions

Language: Python, License: MIT

so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

Language: Python, License: NOASSERTION

speechbrain

A PyTorch-based Speech Toolkit

Language: Python, License: Apache-2.0

vits_chinese-1

Best-practice TTS based on BERT and VITS with some Natural Speech features from Microsoft; supports streaming output.

Language: Python, License: MIT

interview

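📚 A summary of fundamental knowledge for C/C++ technical interviews, covering the language, libraries, data structures, algorithms, systems, networking, and linking, loading, and libraries, as well as interview experience, job postings, and referrals.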

License: NOASSERTION

mnist-onnx-runtime

MoE model with onnx runtime

roop-cam

Real-time face swap and one-click video deepfake using only a single image (uncensored)

License: AGPL-3.0

sherpa-onnx

Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, and JavaScript.

License: Apache-2.0

stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

License: MIT

talking-face-arxiv-daily

🎓 Daily updates of talking-face research papers, now integrated with LLM analysis.

Language: Python, License: Apache-2.0

tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

Language: Jupyter Notebook, License: Apache-2.0

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python, License: MPL-2.0

TTS-arxiv-daily

Automatically updates text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)

Language: Python, License: Apache-2.0

vits-simple-api

A simple VITS HTTP API, developed by extending Moegoe with additional features.

Language: Python, License: AGPL-3.0

voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
