YenTing (Adam) Lin's starred repositories
tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
insanely-fast-whisper
Incredibly fast Whisper-large-v3
RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Megatron-LLM
Distributed trainer for LLMs
whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
c4-dataset-script
Inspired by Google's C4, this is a series of Colossal Clean data cleaning scripts focused on CommonCrawl data processing, including Chinese data processing and the cleaning methods in MassiveText.
alignment-handbook
Robust recipes to align language models with human and AI preferences
tiny-openai-whisper-api
OpenAI Whisper API-style local server, running on FastAPI
awesome-mixture-of-experts
A collection of AWESOME things about mixture-of-experts
Taiwan-LLM
Traditional Mandarin LLMs for Taiwan
traditional_chinese_llama2
Fine-tune Llama 2 with a Traditional Chinese dataset
traditional-chinese-alpaca
A Traditional-Chinese instruction-following model with datasets based on Alpaca.
deep_learning_curriculum
Language model alignment-focused deep learning curriculum
langchain-ask-the-doc
Ask the Doc app built using Langchain and Streamlit.
show-me-chatgpt-plugin
Create and edit diagrams in ChatGPT
longeval-summarization
Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https://arxiv.org/abs/2301.13298).
awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
LLMsPracticalGuide
A curated list of practical guide resources for LLMs (LLMs Tree, Examples, Papers)
text-generation-inference
Large Language Model Text Generation Inference
arxiv-latex-cleaner
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
awesome-chatgpt-dataset
Unlock the Power of LLM: Explore These Datasets to Train Your Own ChatGPT!
chatbot-ui
AI chat for every model.