Liu Jun's repositories
TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
ecapture
Capture SSL/TLS plaintext without a CA certificate using eBPF. Supports Linux and Android on x86_64/aarch64.
prize
A prize for finding tasks that cause large language models to show inverse scaling
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
StableLM
StableLM: Stability AI Language Models
pytorch_optimizer
Collections of optimizers and learning-rate schedulers for PyTorch
lilac
Analyze, structure and clean unstructured data with AI
pythainlp-corpus
pythainlp-data
FunASR
A Fundamental End-to-End Speech Recognition Toolkit
BlonDe
Official implementations for (1) BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation and (2) Discourse Centric Evaluation of Machine Translation with a Densely Annotated Parallel Corpus
LLMDataHub
A quick guide to trending instruction-finetuning datasets
transfomers-silicon-research
Research and materials on hardware implementations of Transformer models
NeMo
NeMo: a toolkit for conversational AI
tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
faster-whisper
Faster Whisper transcription with CTranslate2
whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
Chat-Haruhi-Suzumiya
Chat-Haruhi-Suzumiya (Chat凉宫春日): a chatbot that mimics anime-style character dialogue, developed by Li Lulu, Leng Zi'ang, and fellow students.
speechbrain
A PyTorch-based Speech Toolkit
AgentSims
AgentSims is an easy-to-use infrastructure for researchers from all disciplines to test the specific capacities they are interested in.
ecco
Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT-2, BERT, RoBERTa, T5, and T0).
whisper
Robust Speech Recognition via Large-Scale Weak Supervision
llama
Inference code for LLaMA models
ml-pretrained
Implementations of various pre-trained models
AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code, specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
bitsandbytes
8-bit CUDA functions for PyTorch
llm-reasoners
A library for advanced large language model reasoning
open-muse
Open reproduction of MUSE for fast text2image generation.
mt3
MT3: Multi-Task Multitrack Music Transcription
gigagan-pytorch
Implementation of GigaGAN, the new SOTA GAN from Adobe; the culmination of nearly a decade of research into GANs.