Haoxiang Wang's starred repositories
reward-bench
RewardBench: the first evaluation tool for reward models.
bedirt.github.io
My personal website
hugo-PaperMod
A fast, clean, responsive Hugo theme.
CodeUltraFeedback
CodeUltraFeedback: aligning large language models to coding preferences
2025QuantInternships
Public quant internship repository, maintained by NUFT but available for everyone.
RLHF-Reward-Modeling
Recipes to train reward models for RLHF.
Directional-Preference-Alignment
Directional Preference Alignment
flash-attention
Fast and memory-efficient exact attention
prometheus
[ICLR 2024 & NeurIPS 2023 WS] An Evaluator LM that is open-source, offers reproducible evaluation, and is inexpensive to use. Specifically designed for fine-grained evaluation on a customized score rubric, Prometheus is a good alternative to human evaluation and GPT-4 evaluation.
LLaMA-Factory
A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
lm-evaluation-harness
A framework for few-shot evaluation of language models.
tensor_parallel
Automatically split your PyTorch models on multiple GPUs for training & inference
mint-bench
Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Zihan Wang*, Jiateng Liu, Yangyi Chen, Lifan Yuan, Hao Peng and Heng Ji.