vtddggg

vtddggg's starred repositories

Auto-Arena-LLMs

Language: Jupyter Notebook | License: Apache-2.0 | 900

Qwen2

Qwen2 is the large language model series developed by the Qwen team, Alibaba Cloud.

Language: Shell | 496800

ChatTTS

ChatTTS is a generative speech model for daily dialogue.

Language: Jupyter Notebook | License: NOASSERTION | 2379400

TrustLLM

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

Language: Python | License: MIT | 33600

S-Eval

S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models

License: NOASSERTION | 2000

LLMLingua

Speeds up LLM inference and enhances LLMs' perception of key information by compressing the prompt and KV cache, achieving up to 20x compression with minimal performance loss.

Language: Python | License: MIT | 408800

SALAD-BENCH

[ACL 2024] SALAD benchmark & MD-Judge

Language: Python | License: Apache-2.0 | 6400

RLHF-Reward-Modeling

Recipes to train reward model for RLHF.

Language: Python | License: Apache-2.0 | 30300

PurpleLlama

Set of tools to assess and improve LLM security.

Language: Python | License: NOASSERTION | 208900

LLM-Conversation-Safety

[NAACL 2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey

4300

llmeval-2

Chinese large language model evaluation, phase 2

6600

z-bench

Z-Bench 1.0 by ZhenFund: a Chinese test set for large language models, made by and for "Muggles". Z-Bench is an LLM prompt dataset for non-technical users, developed by an enthusiastic AI-focused team at ZhenFund.

License: CC-BY-4.0 | 46100

QAnything

Question and Answer based on Anything.

Language: Python | License: Apache-2.0 | 1016800

AlignBench

A multi-dimensional benchmark for evaluating Chinese alignment | Benchmarking Chinese Alignment of LLMs

Language: Python | 22800

ModelAssess

Chinese arena-style evaluation of large models

400

MNBVC

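MNBVC (Massive Never-ending BT Vast Chinese corpus) is an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (stylized internet slang). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.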

License: MIT | 312500