WhiteFu's starred repositories
awesome-chatgpt-prompts
This repo is a curated collection of ChatGPT prompts for using ChatGPT more effectively.
manga-image-translator
Translate manga/images: one-click translation of text in all kinds of images. https://cotrans.touhou.ai/
LLaMA-Efficient-Tuning
Easy-to-use LLM fine-tuning framework (LLaMA-2, BLOOM, Falcon, Baichuan, Qwen, ChatGLM2)
CTranslate2
Fast inference engine for Transformer models
long_llama
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
tts-generation-webui
TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS)
tortoise-tts-fast
Fast TorToiSe inference (5x or your money back!)
language-detection
This is a language detection library implemented in plain Java. (aliases: language identification, language guessing)
UltraSinger
AI-based tool that converts vocals, lyrics, and pitch from music to autogenerate Ultrastar Deluxe, MIDI, and notes files. It automates tapping, adding text, and pitching vocals, and creates karaoke files.
DL-Art-School
TorToiSe fine-tuning with DLAS
ai-audio-datasets-list
This is a list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications. It is mainly used for speech recognition, speech synthesis, singing voice synthesis, music information retrieval, music generation, etc.
Easy-Translate
Easy-Translate is a script for translating large text files with a SINGLE COMMAND. Easy-Translate is designed to be as easy as possible for beginners and as seamless and customizable as possible for advanced users.
RecAlgorithm
Implementations of mainstream ranking algorithms for recommender systems
UnitSpeech
An official implementation of "UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data"
tortoise-tts-fastest
Faster Tortoise inference than Tortoise Fast Fork
PolyLangVITS
Multi-speaker Speech Synthesis Using VITS (KO, JA, EN, ZH)
laughter-synthesis
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" accepted by INTERSPEECH 2023.
NSF-BigVGAN
BigVGAN with Neural Source-Filter
CML-TTS-Dataset
CML-TTS: A Multilingual Dataset for Speech Synthesis
useful_audio_scripts
Some useful scripts for audio