QinLuo's repositories
libaio
Imported from https://pagure.io/libaio.git
textbook_quality
Generate textbook-quality LLM pretraining data
Memex
Browser extension to curate, annotate, and discuss the most valuable content and ideas on the web, as individuals, teams, and communities.
tqdm-loggable
Logging friendly progress messages for TQDM progress bars
streaming-llm
Efficient Streaming Language Models with Attention Sinks
carton
Run any ML model from any programming language.
h2o-llmstudio
H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://h2oai.github.io/h2o-llmstudio/
ExpertQA
[Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers
semantic-kernel
Integrate cutting-edge LLM technology quickly and easily into your apps
bytepiece
A purer tokenizer with a higher compression rate
evol-teacher
Open Source WizardCoder Dataset
multipack_sampler
Multipack distributed sampler for fast padding-free training of LLMs
Mr.-Ranedeer-AI-Tutor
A GPT-4 AI Tutor Prompt for customizable personalized learning experiences.
LOMO
LOMO: LOw-Memory Optimization
tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
whisper
Robust Speech Recognition via Large-Scale Weak Supervision
GPT-4-LLM
Instruction Tuning with GPT-4
streaming
A Data Streaming Library for Efficient Neural Network Training
Megatron-LM
Ongoing research training transformer models at scale
bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
chemnlp
ChemNLP project
MiniGPT-4
MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models
MOSS
An open-source tool-augmented conversational language model from Fudan University
rl_a3c_pytorch
A3C LSTM Atari with PyTorch plus A3G design
datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools