open-compass / opencompass

OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Home Page: https://opencompass.org.cn/

🔥 Collaborate to Enhance OpenCompass: Introducing Diverse NLP Datasets

liushz opened this issue

Describe the feature

Hello everyone!

In order to better cater to the needs of our community, we plan to incorporate a range of diverse NLP datasets into the platform. These datasets will cover various domains such as text classification, natural language inference, coreference resolution, semantic matching, question answering, text generation, and more. This enhancement aims to provide a more comprehensive and in-depth assessment of models.

Below is the list of NLP datasets we intend to add, grouped by category. Datasets marked with ⭐️ are in high demand:

Text Classification:

Natural Language Inference:

  • XNLI (Supports both Chinese and English): Link
  • WNLI: Link
  • QNLI: Link

Coreference Resolution:

  • xwinograd (Supports both Chinese and English): Link

Semantic Matching:

  • QQP: Link ⭐️
  • MRPC: Link ⭐️
  • PAWS-X (Supports both Chinese and English): Link

General Question Answering:

Multi-Turn Question Answering:

Knowledge Question Answering:

Reasoning Question Answering:

Safety, Ethics, and Morality:

  • Ethics: Link ⭐️

Text Generation:

We warmly welcome every member of the OpenCompass community to actively participate and collaborate in seamlessly integrating these datasets into our evaluation platform. Here's how you can get involved:

  1. Choose Datasets: In the comments, let us know which datasets you're interested in helping to add, or provide suggestions for existing ones.
  2. Dataset Information: If you're aware of relevant links, descriptions, licensing details, etc., please do share them as it will greatly aid our integration efforts. Here's a template to follow:
   Name: GSM8K
   Link: [https://github.com/openai/grade-school-math](https://github.com/openai/grade-school-math)
   Introduction: GSM8K is a grade school math word problem dataset. Each problem takes several steps of basic arithmetic to solve and has a single numeric answer. The dataset consists of roughly 7.5k training samples and 1k test samples, all in English.
   Sample: 
   {
     "question": "If f(x) = x^2 + 3x - 2, what is f(-1)?",
     "answer": "f(-1) = (-1)^2 + 3(-1) - 2\nf(-1) = 1 - 3 - 2\nf(-1) = -4"
   }
   License: MIT License
  3. New Dataset Suggestions: If you have other suitable dataset recommendations, feel free to share your insights in the comments.
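
If it helps to prepare a submission before commenting, the template above can be sketched as a small Python structure. This is only an illustration: `DatasetProposal`, `missing_fields`, and `to_comment` are hypothetical names made up for this sketch and are not part of OpenCompass. It renders the fields in the template's layout and reports any that were left empty.

```python
import json
from dataclasses import dataclass, field, fields


@dataclass
class DatasetProposal:
    """Hypothetical container mirroring the fields of the template above."""

    name: str
    link: str
    introduction: str
    sample: dict = field(default_factory=dict)
    license: str = ""

    def missing_fields(self) -> list:
        """Return the names of fields that were left empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def to_comment(self) -> str:
        """Render the proposal in the same layout as the template."""
        return "\n".join([
            f"Name: {self.name}",
            f"Link: {self.link}",
            f"Introduction: {self.introduction}",
            "Sample:",
            json.dumps(self.sample, indent=2, ensure_ascii=False),
            f"License: {self.license}",
        ])


if __name__ == "__main__":
    proposal = DatasetProposal(
        name="GSM8K",
        link="https://github.com/openai/grade-school-math",
        introduction="Grade school math word problems, all in English.",
        sample={
            "question": "If f(x) = x^2 + 3x - 2, what is f(-1)?",
            "answer": "f(-1) = (-1)^2 + 3(-1) - 2\nf(-1) = 1 - 3 - 2\nf(-1) = -4",
        },
        license="MIT License",
    )
    # Any field reported here still needs to be filled in before posting.
    print("Missing:", proposal.missing_fields())
    print(proposal.to_comment())
```

Checking for empty fields first is a cheap way to make sure the licensing details are not forgotten, since reviewers need them for integration.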

How to add?

Adding a new dataset involves several steps:

  1. Documentation: Please visit the documentation which provides a step-by-step guide on how to add a new dataset.

  2. Check Input & Output: Once your new dataset config is ready, use the Prompt Viewer tool in OpenCompass to easily check the Input & Output.

  3. Preparation: Follow the documentation and write code, then create the corresponding Pull Request. If you are not familiar with Pull Requests, this Contribution Guide may help you :>

  4. Description: In your Pull Request, provide a detailed description of the datasets you intend to add, along with the relevant links, descriptions, licensing information, and the bilingual content you shared earlier in this issue. Submit your Pull Request. Our community reviewers will then assess your contribution and provide feedback.
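
To make the "Check Input & Output" step more concrete, here is a rough plain-Python stand-in for what that check looks at: the prompt a model would receive for one sample, and the reference a prediction would be compared against. It does not use or reproduce the OpenCompass Prompt Viewer. `PROMPT_TEMPLATE`, `render_prompt`, `final_line`, and `is_match` are hypothetical names, and treating the last line of the answer as the reference is an assumption made only for this sketch.

```python
# Hypothetical prompt template; a real dataset config defines its own.
PROMPT_TEMPLATE = "Question: {question}\nAnswer:"


def render_prompt(template: str, sample: dict) -> str:
    """Fill the template with the sample's fields (the model input)."""
    return template.format(**sample)


def final_line(text: str) -> str:
    """Return the last non-empty line, treated here as the final answer."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return lines[-1] if lines else ""


def is_match(prediction: str, sample: dict) -> bool:
    """Exact-match check of a prediction against the reference answer."""
    return prediction.strip() == final_line(sample["answer"])


if __name__ == "__main__":
    sample = {
        "question": "If f(x) = x^2 + 3x - 2, what is f(-1)?",
        "answer": "f(-1) = (-1)^2 + 3(-1) - 2\nf(-1) = 1 - 3 - 2\nf(-1) = -4",
    }
    print(render_prompt(PROMPT_TEMPLATE, sample))   # what the model sees
    print(final_line(sample["answer"]))             # what it is scored against
    print(is_match("f(-1) = -4", sample))
```

Looking at both sides for a handful of samples tends to surface problems such as a missing field in the template or a reference that includes more than the final answer.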

By following these steps, you can actively contribute to enriching the OpenCompass evaluation platform with new and valuable datasets. If you encounter any issues during the process or need further assistance, feel free to ask. We appreciate your dedication to making OpenCompass a more diverse and comprehensive resource for NLP model evaluation. Happy contributing!

Would you like to implement this feature yourself?

  • I would like to implement this feature myself and contribute code to OpenCompass!