ChaeHun Park's starred repositories
seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
Awesome-Multimodal-Large-Language-Models
Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
faster-whisper
Faster Whisper transcription with CTranslate2
speechbrain
A PyTorch-based Speech Toolkit
bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
CTranslate2
Fast inference engine for Transformer models
py-webrtcvad
Python interface to the WebRTC Voice Activity Detector
Comprehensive-Transformer-TTS
A Non-Autoregressive Transformer-based Text-to-Speech system, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.
DiffGAN-TTS
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
Awesome-Machine-Generated-Text
Continuously updated list of related resources for generative LLMs like GPT and their analysis and detection.
AlignScore
AlignScore, a metric for factual consistency evaluation.
weakly-supervised-mVLP
Implementation of our ACL 2023 paper: Unifying Cross-Lingual and Cross-Modal Modeling Towards Weakly Supervised Multilingual Vision-Language Pre-training