Mddct

Dinghao Zhou's starred repositories

jaxloudnorm

JAX implementation of a flexible audio loudness meter in Python, implementing the ITU-R BS.1770-4 loudness algorithm

Language: Python · MIT · 600

julius

Fast PyTorch-based DSP for audio and 1D signals

Language: Python · MIT · 40900

dm_aux

Language: Python · Apache-2.0 · 6100

audiotools

Object-oriented handling of audio data, with GPU-powered augmentations, and more.

Language: Python · MIT · 19100

DAC-JAX

A JAX Implementation of the Descript Audio Codec

Language: Python · MIT · 1500

jaxrl

JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces.

Language: Jupyter Notebook · MIT · 59300

multihost_dataloading

Experimenting with how best to do multi-host dataloading

Language: Python · 600

SOFA

SOFA: Singing-Oriented Forced Aligner

Language: Python · MIT · 7800

tokenizers

Go bindings for HuggingFace Tokenizer

Language: Go · MIT · 6400

QQMusicSpider

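Scrapy-based QQ Music spider (QQ Music Spider) that crawls song information, lyrics, and top comments, and shares a music corpus of over 490,000 entries from the top 6,400 mainland Chinese, Hong Kong, and Taiwanese singers on QQ Music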

Language: Python · 29500

legado

Book sources for the 阅读 (Legado) reading app

Language: HTML · 191400

SALMONN

SALMONN: Speech Audio Language Music Open Neural Network

Language: Python · Apache-2.0 · 84500

SpeechGPT

SpeechGPT Series: Speech Large Language Models

Language: Python · Apache-2.0 · 97800

Awesome-instruction-tuning

A curated list of awesome instruction tuning datasets, models, papers and repositories.

Language: Python · Apache-2.0 · 26000

LLMDataHub

A quick guide (especially) for trending instruction finetuning datasets

MIT · 212900

ebook

E-books

27200

MAP-NEO

Language: Python · 66700

HA2G

[CVPR 2022] Code for "Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation"

Language: Python · GPL-3.0 · 12100

highway

Performance-portable, length-agnostic SIMD with runtime dispatch

Language: C++ · Apache-2.0 · 371300

iqiyi-parser

Parses and downloads videos from iQiyi, Bilibili, and Tencent Video

Language: Python · MIT · 93800

TorchKMeans

A torch-based implementation of K-Means and K-Means++

Language: Python · 1600

torch_kmeans

PyTorch implementations of KMeans, Soft-KMeans and Constrained-KMeans which can be run on GPU and work on (mini-)batches of data.

Language: Python · MIT · 4300

ChatTTS

ChatTTS is a generative speech model for daily dialogue.

Language: Jupyter Notebook · NOASSERTION · 2230600

DeepFilterNet

Noise suppression using deep filtering

Language: Python · NOASSERTION · 204800

examples

A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.

Language: Python · BSD-3-Clause · 2190300

xcodec

X-Codec: Unified Audio Tokenizer for Audio Language Model

1400

music-dl

Music searcher and downloader.

Language: PHP · MIT · 61600

UMOE-Scaling-Unified-Multimodal-LLMs

Code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"

Language: Python · 69900

NExT-GPT

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Language: Python · BSD-3-Clause · 297900

audio-ai-timeline

A timeline of the latest AI models for audio generation, starting in 2023!

186900