xjchenGit

National Taiwan University

Taipei, Taiwan

Victor Chen's starred repositories

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python | License: MIT | 26547 178 839

leedl-tutorial

Hung-yi Lee Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases

Language: Jupyter Notebook | License: NOASSERTION | 10090 257 77

jukebox

Code for the paper "Jukebox: A Generative Model for Music"

Language: Python | License: NOASSERTION | 7650 305 258

awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

Language: Python | License: MIT | 4356 88 10

encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Language: Python | License: MIT | 3246 57 70

s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Language: Python | License: Apache-2.0 | 2128 45 386

omnizart

Omniscient Mozart, being able to transcribe everything in the music, including vocal, drum, chord, beat, instruments, and more.

Language: Python | License: MIT | 1581 25 75

mt3

MT3: Multi-Task Multitrack Music Transcription

Language: Python | License: Apache-2.0 | 1329 26 88

madmom

Python audio and music signal processing library

Language: Python | License: NOASSERTION | 1257 43 263

ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Jupyter Notebook | License: BSD-3-Clause | 1032 19 129

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Language: Python | License: MIT | 951 25 56

TruthfulQA

TruthfulQA: Measuring How Models Imitate Human Falsehoods

Language: Jupyter Notebook | License: Apache-2.0 | 520 8 10

mixup

Implementation of the mixup training method

Language: Python | License: BSD-3-Clause | 457 7 8

awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

DoLa

Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"

Language: Python | 354 2 12

all-in-one

All-In-One Music Structure Analyzer

Language: Python | License: MIT | 341 10 6

TrustLLM

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

Language: Python | License: MIT | 330 6 16

midi-ddsp

Synthesis of MIDI with DDSP (https://midi-ddsp.github.io/)

Language: Python | License: Apache-2.0 | 295 11 16

audioldm_eval

This toolbox aims to unify audio generation model evaluation for easier comparison.

Language: Python | License: MIT | 271 5 7

ML_Practice

ML Records in 1110 Lab of BUPT. Some detailed information can be referenced on: https://mathpretty.com/10388.html

Language: Jupyter Notebook | License: MIT | 227 5 3

Efficient_Foundation_Model_Survey

Survey Paper List - Efficient LLM and Foundation Models

Beat-Transformer

Codes for ISMIR 2022 paper: Beat Transformer: Demixed Beat and Downbeat Tracking with Dilated Self-Attention

Language: Python | License: MIT | 82 3 18

MARBLE-Benchmark

Music Audio Representation Benchmark for Universal Evaluation

Language: Python | License: MIT | 71 2 11

FactualityPrompt

Language: Python | License: Apache-2.0 | 69 30

TONet

The official implementation of "TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music"

Language: Python | 40 4 7

benadar293.github.io

Unaligned Supervision for Automatic Music Transcription in The Wild

Language: JavaScript | License: NOASSERTION | 3400

SMC-Bench

[ICLR 2023] "Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!" Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, AJAY KUMAR JAISWAL, Zhangyang Wang

Language: Python | 25 110

MTDVocaLiST

Official repository for the paper Multimodal Transformer Distillation for Audio-Visual Synchronization.

Language: Python | 1600

LLMs-fall-23

Language: Jupyter Notebook | 700

awesome-audio-visual-deepfake

awesome-audio-visual-robustness

7 10