Victor Chen (xjchenGit)

Company: National Taiwan University

Location: Taipei, Taiwan

Home Page: xjchen.tech

Twitter: @xjchen_ntu

Victor Chen's starred repositories

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language: Python | License: MIT | Stargazers: 26547 | Issues: 178 | Issues: 839

leedl-tutorial

Hung-yi Lee's Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 10090 | Issues: 257 | Issues: 77

jukebox

Code for the paper "Jukebox: A Generative Model for Music"

Language: Python | License: NOASSERTION | Stargazers: 7650 | Issues: 305 | Issues: 258

awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

Language: Python | License: MIT | Stargazers: 4356 | Issues: 88 | Issues: 10

encodec

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Language: Python | License: MIT | Stargazers: 3246 | Issues: 57 | Issues: 70

s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 2128 | Issues: 45 | Issues: 386

omnizart

Omniscient Mozart, being able to transcribe everything in the music, including vocal, drum, chord, beat, instruments, and more.

Language: Python | License: MIT | Stargazers: 1581 | Issues: 25 | Issues: 75

mt3

MT3: Multi-Task Multitrack Music Transcription

Language: Python | License: Apache-2.0 | Stargazers: 1329 | Issues: 26 | Issues: 88

madmom

Python audio and music signal processing library

Language: Python | License: NOASSERTION | Stargazers: 1257 | Issues: 43 | Issues: 263

ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 1032 | Issues: 19 | Issues: 129

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Language: Python | License: MIT | Stargazers: 951 | Issues: 25 | Issues: 56

TruthfulQA

TruthfulQA: Measuring How Models Imitate Human Falsehoods

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 520 | Issues: 8 | Issues: 10

mixup

Implementation of the mixup training method

Language: Python | License: BSD-3-Clause | Stargazers: 457 | Issues: 7 | Issues: 8

awesome-large-audio-models

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

DoLa

Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"

all-in-one

All-In-One Music Structure Analyzer

Language: Python | License: MIT | Stargazers: 341 | Issues: 10 | Issues: 6

TrustLLM

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

Language: Python | License: MIT | Stargazers: 330 | Issues: 6 | Issues: 16

midi-ddsp

Synthesis of MIDI with DDSP (https://midi-ddsp.github.io/)

Language: Python | License: Apache-2.0 | Stargazers: 295 | Issues: 11 | Issues: 16

audioldm_eval

This toolbox aims to unify audio generation model evaluation for easier comparison.

Language: Python | License: MIT | Stargazers: 271 | Issues: 5 | Issues: 7

ML_Practice

ML Records in 1110 Lab of BUPT. Some detailed information can be referenced on: https://mathpretty.com/10388.html

Language: Jupyter Notebook | License: MIT | Stargazers: 227 | Issues: 5 | Issues: 3

Efficient_Foundation_Model_Survey

Survey Paper List - Efficient LLM and Foundation Models

Beat-Transformer

Codes for ISMIR 2022 paper: Beat Transformer: Demixed Beat and Downbeat Tracking with Dilated Self-Attention

Language: Python | License: MIT | Stargazers: 82 | Issues: 3 | Issues: 18

MARBLE-Benchmark

Music Audio Representation Benchmark for Universal Evaluation

Language: Python | License: MIT | Stargazers: 71 | Issues: 2 | Issues: 11

Language: Python | License: Apache-2.0 | Stargazers: 69 | Issues: 3 | Issues: 0

TONet

The official implementation of "TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music"

benadar293.github.io

Unaligned Supervision for Automatic Music Transcription in The Wild

Language: JavaScript | License: NOASSERTION | Stargazers: 34 | Issues: 0 | Issues: 0

SMC-Bench

[ICLR 2023] "Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!" Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, AJAY KUMAR JAISWAL, Zhangyang Wang

Language: Python | Stargazers: 25 | Issues: 11 | Issues: 0

MTDVocaLiST

Official repository for the paper Multimodal Transformer Distillation for Audio-Visual Synchronization.

Language: Python | Stargazers: 16 | Issues: 0 | Issues: 0

Language: Jupyter Notebook | Stargazers: 7 | Issues: 0 | Issues: 0

awesome-audio-visual-deepfake

awesome-audio-visual-robustness