AI-X-King's repositories
AlpacaDataCleaned
Alpaca dataset from Stanford, cleaned and curated
apps
A benchmark for LLM code generation
awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
axolotl
Go ahead and axolotl questions
CodeFormer
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
data-preparation
Code used for sourcing and cleaning the BigScience ROOTS corpus
data_management_LLM
Collection of training data management explorations for large language models
datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
DNS-Challenge
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
ECAPA-TDNN
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 on Vox1_O when trained only on Vox2)
FasterTransformer
Transformer-related optimization, including BERT and GPT
lhotse
Tools for handling speech data in machine learning projects.
LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs
llama.cpp
LLM inference in C/C++
promptbase
All things prompt engineering
PSST
Prosodic Speech Segmentation with Transformers
pyctcdecode
A fast and lightweight python-based CTC beam search decoder for speech recognition.
pytorch-docker
Pure PyTorch Docker images.
RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
sglang
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
sherpa
Streaming and non-streaming ASR server for next-gen Kaldi
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
whisper
Robust Speech Recognition via Large-Scale Weak Supervision
whisper-finetune
Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from Hugging Face.