Qian's repositories
Persona-Dialogue-Generation
Code for the ACL 2020 paper "You Impress Me: Dialogue Generation via Mutual Persona Perception"
code-html-to-markdown
A lightweight script for converting HTML pages to Markdown, with support for code blocks
OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
santacoder-finetuning-commit
Fine-tune SantaCoder for Code/Text Generation.
simpleRL-reason
A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
axolotl
Go ahead and axolotl questions
bytepiece
A purer tokenizer with a higher compression rate
dclm
DataComp for Language Models
dl4c.github.io
Deep Learning for Code Website
dl4c.github.io-1
✨ Build a beautiful and simple website in literally minutes. Demo at https://beautifuljekyll.com
extract-expert
Extract a single expert from a Mixtral-architecture MoE model using slerp
Megatron-LLM
Distributed trainer for LLMs
oat
🌾 OAT: Online AlignmenT for LLMs
OpenAgents
OpenAgents: An Open Platform for Language Agents in the Wild
Precision-RL
Defeating the Training-Inference Mismatch via FP16
sailcraft
Data Toolkit for Sailor Language Models
Triton-Puzzles
Puzzles for learning Triton
verl-pipeline
Async pipelined version of Verl
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs