huggingface / hub-docs

Docs of the Hugging Face Hub

Home Page: http://hf.co/docs/hub

[hacktoberfest] Model `no license` challenge

Wauplin opened this issue

Issue to keep track of the "Model no license challenge" for Hacktoberfest 2023.

Context

The Hugging Face Hub hosts hundreds of thousands of public models and datasets. Public doesn't necessarily mean open-source without any limitations. Authors can define which license applies to the work they share (e.g. MIT, Apache 2.0, OpenRAIL, etc.). All users must be able to quickly find out which license applies to which model, and even to list models with a specific license (e.g. Apache 2.0). The Hub relies on the Model Card to do so. A Model Card is a file attached to a model that provides handy information. Model Cards are essential for discoverability, reproducibility and sharing. In our case, we will focus on the metadata section of the Model Card. This metadata contains valuable information, including a license tag.

In this challenge, we will focus on models that have no license defined in their metadata but that do have a LICENSE file in the repo. These are models for which the author actually cares about the model license but didn't make it searchable by users.
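The selection rule above can be sketched in a few lines of Python. This is only an illustration: the function name `is_challenge_candidate` and the set of accepted license file names are assumptions made for the example, not the challenge's actual tooling.

```python
from typing import Iterable, Mapping, Optional

# Assumption for this sketch: file names (lowercased) that count as a license file.
LICENSE_FILENAMES = {"license", "license.txt", "license.md"}


def is_challenge_candidate(
    filenames: Iterable[str],
    metadata: Optional[Mapping[str, object]],
) -> bool:
    """Return True if the repo ships a license file but has no `license` tag."""
    # Compare only the last path component, case-insensitively.
    has_license_file = any(
        name.rsplit("/", 1)[-1].lower() in LICENSE_FILENAMES for name in filenames
    )
    # A missing metadata block or an empty `license` value both count as "no license".
    has_license_tag = bool(metadata and metadata.get("license"))
    return has_license_file and not has_license_tag
```

Here `filenames` stands for the list of files in a model repo and `metadata` for the parsed metadata section of its Model Card; how both are obtained is left out of the sketch.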

There are two ways of defining a license tag. If the license is one of the officially supported licenses, simply defining it as a string in the metadata is enough:

# Example from https://huggingface.co/codellama/CodeLlama-34b-hf
---
license: llama2
---

Otherwise, the license is considered to be `other`. In that case, we can set a custom name and a URL for the license. Here is an example of what it looks like:

# Example from https://huggingface.co/coqui/XTTS-v1
---
license: other
license_name: coqui-public-model-license
license_link: https://coqui.ai/cpml
---
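As a rough sketch, the two cases can be expressed as a small helper that builds the metadata block. The names `license_metadata` and `to_front_matter` are hypothetical, and `SUPPORTED_LICENSES` is only an illustrative subset of identifiers, not the Hub's full list. Requiring both a name and a link for `other` is a choice made for this example.

```python
from typing import Dict, Optional

# Assumption: a small illustrative subset; the Hub's real list is longer.
SUPPORTED_LICENSES = {"mit", "apache-2.0", "llama2", "openrail"}


def license_metadata(
    license_id: str,
    license_name: Optional[str] = None,
    license_link: Optional[str] = None,
) -> Dict[str, str]:
    """Build the license fields for a Model Card metadata section."""
    # Case 1: a supported license only needs its identifier.
    if license_id in SUPPORTED_LICENSES:
        return {"license": license_id}
    # Case 2: anything else is tagged `other` with a custom name and URL.
    if not license_name or not license_link:
        raise ValueError("a custom license needs both a name and a link")
    return {
        "license": "other",
        "license_name": license_name,
        "license_link": license_link,
    }


def to_front_matter(metadata: Dict[str, str]) -> str:
    """Render the fields as a `---`-delimited block like the examples above."""
    lines = ["---"]
    lines += [f"{key}: {value}" for key, value in metadata.items()]
    lines.append("---")
    return "\n".join(lines)
```

For instance, `to_front_matter(license_metadata("llama2"))` reproduces the first example, and passing `"coqui-public-model-license"` and `"https://coqui.ai/cpml"` as the name and link reproduces the second. The sketch only formats flat string values; real Model Card metadata is YAML and would normally be written with a proper YAML library or a Hub client.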

This challenge aims to improve the completeness of this metadata on the Hub, which will ultimately benefit all users.

Instructions

Check out the detailed instructions here.

Feel free to ping @davanstrien or @Wauplin with any questions or for a review.