Victor Chen's starred repositories
GPT-SoVITS
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
leedl-tutorial
Hung-yi Lee Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
awesome-pretrained-chinese-nlp-models
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models
descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
TruthfulQA
TruthfulQA: Measuring How Models Imitate Human Falsehoods
awesome-large-audio-models
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
all-in-one
All-In-One Music Structure Analyzer
audioldm_eval
This toolbox aims to unify audio generation model evaluation for easier comparison.
ML_Practice
ML records from the 1110 Lab at BUPT. Further details are available at: https://mathpretty.com/10388.html
Efficient_Foundation_Model_Survey
Survey Paper List - Efficient LLM and Foundation Models
Beat-Transformer
Code for the ISMIR 2022 paper: Beat Transformer: Demixed Beat and Downbeat Tracking with Dilated Self-Attention
MARBLE-Benchmark
Music Audio Representation Benchmark for Universal Evaluation
benadar293.github.io
Unaligned Supervision for Automatic Music Transcription in The Wild
MTDVocaLiST
Official repository for the paper Multimodal Transformer Distillation for Audio-Visual Synchronization.
awesome-audio-visual-deepfake
awesome-audio-visual-robustness