MlWoo

Location: Beijing

MlWoo's starred repositories

llama

Inference code for LLaMA models

Language: Python | License: NOASSERTION | Stargazers: 50895 | Issues: 499 | Issues: 872

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 35054 | Issues: 321 | Issues: 430

tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Language: Python | License: Apache-2.0 | Stargazers: 24829 | Issues: 194 | Issues: 3950

DALLE2-pytorch

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch

Language: Python | License: MIT | Stargazers: 11025 | Issues: 120 | Issues: 210

seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 10689 | Issues: 140 | Issues: 343

pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

Language: Python | License: Apache-2.0 | Stargazers: 5790 | Issues: 210 | Issues: 308

Pretrained-Language-Model

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

vall-e

An unofficial PyTorch implementation of the audio LM VALL-E

Language: Python | License: MIT | Stargazers: 2927 | Issues: 89 | Issues: 97

audio-ai-timeline

A timeline of the latest AI models for audio generation, starting in 2023!

wer_are_we

An attempt at tracking the state of the art and recent results (bibliography) on speech recognition.

FlexFlow

FlexFlow Serve: Low-Latency, High-Performance LLM Serving

Language: C++ | License: Apache-2.0 | Stargazers: 1625 | Issues: 33 | Issues: 632

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Language: Python | License: MIT | Stargazers: 1086 | Issues: 26 | Issues: 72

versatile_audio_super_resolution

Versatile audio super resolution (any -> 48kHz) with AudioSR.

Language: Python | License: MIT | Stargazers: 1058 | Issues: 24 | Issues: 53

conditional-flow-matching

TorchCFM: a Conditional Flow Matching library

Language: Python | License: MIT | Stargazers: 997 | Issues: 14 | Issues: 47

lhotse

Tools for handling speech data in machine learning projects.

Language: Python | License: Apache-2.0 | Stargazers: 921 | Issues: 44 | Issues: 407

tortoise-tts-fast

Fast TorToiSe inference (5x or your money back!)

Language: Jupyter Notebook | License: AGPL-3.0 | Stargazers: 768 | Issues: 27 | Issues: 125

fairseq2

FAIR Sequence Modeling Toolkit 2

Language: Python | License: MIT | Stargazers: 660 | Issues: 18 | Issues: 98

Meta-voicebox

Implementation of Meta-Voicebox: the first generative AI model for speech to generalize across tasks with state-of-the-art performance.

deep-vector-quantization

VQVAEs, GumbelSoftmaxes and friends

Language: Jupyter Notebook | License: MIT | Stargazers: 516 | Issues: 14 | Issues: 7

SpeechTokenizer

This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on

Language: Python | License: Apache-2.0 | Stargazers: 400 | Issues: 15 | Issues: 11

Language: Python | License: Apache-2.0 | Stargazers: 341 | Issues: 16 | Issues: 25

Comprehensive-Transformer-TTS

A non-autoregressive Transformer-based text-to-speech system, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS

Language: Python | License: MIT | Stargazers: 319 | Issues: 13 | Issues: 20

DL-Art-School

TorToiSe fine-tuning with DLAS

Language: Python | License: AGPL-3.0 | Stargazers: 208 | Issues: 15 | Issues: 62

RAM-multiprocess-dataloader

Demystify RAM Usage in Multi-Process Data Loaders

Language: Python | License: Apache-2.0 | Stargazers: 169 | Issues: 8 | Issues: 9

USLM

Unified Speech Language Model for the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024)

d2c

PyTorch implementation of D2C: Diffusion-Decoding Models for Few-shot Conditional Generation.

Language: Python | License: MIT | Stargazers: 120 | Issues: 4 | Issues: 6

ASR-Benchmarks

An effort to track benchmarking results over widely-used datasets for ASR.

DistSup

Representation learning for NLP @ JSALT19

Language: Python | License: Apache-2.0 | Stargazers: 34 | Issues: 7 | Issues: 1