vtddggg's starred repositories
SALAD-BENCH
[ACL 2024] SALAD benchmark & MD-Judge
RLHF-Reward-Modeling
Recipes for training reward models for RLHF.
PurpleLlama
Set of tools to assess and improve LLM security.
LLM-Conversation-Safety
[NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
AlignBench
A multi-dimensional Chinese alignment evaluation benchmark | Benchmarking Chinese Alignment of LLMs
ModelAssess
Chinese arena-style evaluation of large language models
transformers_tasks
⭐️ NLP algorithms built with the transformers library. Supports Text-Classification, Text-Generation, Information-Extraction, Text-Matching, RLHF, SFT, etc.
LLaMA-Factory
Unify Efficient Fine-Tuning of 100+ LLMs
Awesome-Chinese-LLM
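A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are cheaper to train, covering base models, vertical-domain fine-tunes and applications, datasets, and tutorials.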
zero-shot-reward-models
ZYN: Zero-Shot Reward Models with Yes-No Questions
FlagEmbedding
Retrieval and Retrieval-augmented LLMs
llm-attacks
Universal and Transferable Attacks on Aligned Language Models
Safety-Prompts
Chinese safety prompts for evaluating and improving the safety of LLMs.