ayulockin / wandb2kaggle

Automatically upload the model saved as W&B Artifact to Kaggle Dataset.

W&B 2 Kaggle

Who is this for?

Competing seriously in a Kaggle competition requires training a LOT of models with different hyperparameters and different splits of the dataset. You can use Weights and Biases to keep track of all your experiments, but what about the models trained and saved during each experiment? Weights and Biases Artifacts can be used to version control the models and keep track of their dependencies.

However, many Kaggle competitions are "code based" and require the trained (fine-tuned) model(s) to be uploaded as a Kaggle dataset. Doing this by hand is repetitive and error-prone.

Thus this action is primarily made for:

What does it do?

This action automates uploading the latest version of a selected model, saved as a W&B Artifact, as a Kaggle Dataset.

When you merge a pull request, the GitHub action kicks off the workflow, in which:

  • the model saved as a W&B Artifact is downloaded,
  • the model is automatically uploaded as a Kaggle Dataset.
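
The download step can be sketched in Python with the public W&B API. This is a minimal sketch and not necessarily how download_model_artifact.py is written: the function names and the "model" output folder are hypothetical, and it assumes the wandb package is installed and that W&B credentials are available to it (the library reads them from the WANDB_API_KEY environment variable).

```python
from pathlib import Path


def with_latest_alias(artifact_name: str) -> str:
    """Append the ':latest' alias to an artifact name given without a version."""
    return f"{artifact_name}:latest"


def download_latest(artifact_name: str, root: str = "model") -> Path:
    """Download the latest version of a W&B artifact into `root` (hypothetical folder)."""
    # Imported lazily so the helper above works without wandb installed.
    import wandb

    api = wandb.Api()  # picks up credentials from WANDB_API_KEY
    artifact = api.artifact(with_latest_alias(artifact_name))
    return Path(artifact.download(root=root))
```

Calling download_latest("entity/project/model") with a real artifact path would leave the model files in the model folder, ready to be pushed to Kaggle by the next step.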

Prerequisite

To use the action you need a W&B account (free) and a Kaggle profile. Store the W&B Access Token and your Kaggle Username and Key as GitHub Secrets.

  • You can get your W&B Access Token by visiting: wandb.ai/authorize. Save it as GitHub Secrets with the key name WANDB_KEY.
  • You can get your Kaggle Username and Key by visiting the settings page of your Kaggle profile. Save the username as KAGGLE_USERNAME and key as KAGGLE_KEY.
  • You can learn more about GitHub Secrets here.
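
A quick way to fail early when a secret is missing is to check for it before doing any work. The sketch below assumes the workflow exposes the three secrets as environment variables under the same names; that mapping is an assumption about your workflow file, and the helper name is hypothetical.

```python
import os

# Secret names from the list above; exposing them as environment
# variables with these exact names is an assumption about the workflow.
REQUIRED_SECRETS = ("WANDB_KEY", "KAGGLE_USERNAME", "KAGGLE_KEY")


def missing_secrets(env=None):
    """Return the required secret names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        raise SystemExit("Missing secrets: " + ", ".join(missing))
```

Note that the wandb library itself looks for WANDB_API_KEY and the Kaggle CLI for KAGGLE_USERNAME and KAGGLE_KEY, so a workflow may need to map WANDB_KEY onto WANDB_API_KEY.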

How to use?

  • You can fork this repository and modify the input parameters in the main.yml workflow file located at .github/workflows/.
  • In your own repository you can create a workflow main.yml file and add the code snippet as shown below. Note that you will have to copy the download_model_artifact.py file to your repository as well.
- name: W&B Artifact to Kaggle Dataset
  uses: ayulockin/wandb2kaggle@v1

Inputs

artifact_name

You can find the artifact_name by visiting the API tab of a model artifact page, as shown in the image below. Notice the use_artifact code snippet. Don't include the version suffix (for example :v0), as the action automatically uses the latest version of the artifact.

image
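
If you copy the name straight from the use_artifact snippet, it will usually carry a version suffix. The helper below is a hypothetical illustration of removing it before passing the value as artifact_name; it only handles :vN and :latest, not custom aliases, and the example names are placeholders.

```python
import re

# Matches a trailing version alias such as ':v0', ':v12' or ':latest'.
_VERSION_SUFFIX = re.compile(r":(v\d+|latest)$")


def strip_version(artifact_name: str) -> str:
    """Return the artifact name without a trailing ':vN' or ':latest' alias."""
    return _VERSION_SUFFIX.sub("", artifact_name.strip())
```

For example, strip_version("entity/project/model:v0") gives "entity/project/model", which is the form this input expects.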

id

Dataset identifier in format {username}/{dataset}. username is your Kaggle username.

title

Title of the Kaggle dataset.

is_public

If true, the created Kaggle dataset is public.
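
For context, the Kaggle CLI reads id and title from a dataset-metadata.json file in the upload folder and takes visibility from a command-line flag. The sketch below shows how the three inputs above could map onto those; it illustrates the Kaggle side only and is not confirmed to be how this action or Push kaggle dataset does it. The function names and the CC0-1.0 default license are assumptions.

```python
import json
from pathlib import Path


def write_dataset_metadata(folder, dataset_id, title, license_name="CC0-1.0"):
    """Write the dataset-metadata.json that the Kaggle CLI expects in `folder`."""
    username, sep, slug = dataset_id.partition("/")
    if not (username and sep and slug) or "/" in slug:
        raise ValueError("id must look like '{username}/{dataset}'")
    metadata = {
        "title": title,
        "id": dataset_id,
        "licenses": [{"name": license_name}],  # assumed default license
    }
    path = Path(folder) / "dataset-metadata.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(metadata, indent=2))
    return path


def create_command(folder, is_public):
    """Build the Kaggle CLI call; datasets are private unless --public is given."""
    cmd = ["kaggle", "datasets", "create", "-p", str(folder)]
    if is_public:
        cmd.append("--public")
    return cmd
```

Keeping command construction separate from execution makes it easy to inspect what would be run before anything is uploaded.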

Note

  • The provided main.yml file triggers the workflow when a commit is made to the main branch. It is good practice to make changes through pull requests rather than committing to main directly. Here's a list of events that can trigger a workflow.

  • Sometimes the workflow might fail. If it's a Dataset Creation Error as shown in the image below, check the Your Work section of Kaggle Datasets manually to see whether the dataset was created.

    image
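
As an alternative to checking in the browser, the Kaggle CLI can report a dataset's creation status. This is a minimal sketch that assumes the kaggle CLI is installed and authenticated through KAGGLE_USERNAME and KAGGLE_KEY; the function names are hypothetical and no particular output text is assumed.

```python
import subprocess


def status_command(dataset_id):
    """Build the Kaggle CLI call that reports a dataset's creation status."""
    return ["kaggle", "datasets", "status", dataset_id]


def dataset_status(dataset_id):
    """Run the status command and return whatever the CLI printed."""
    result = subprocess.run(
        status_command(dataset_id),
        capture_output=True,
        text=True,
        check=False,  # a failed lookup should be reported, not raised
    )
    return (result.stdout or result.stderr).strip()
```

Pass the same {username}/{dataset} value you used for the id input.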

Acknowledgement

The action uses Push kaggle dataset under the hood. Big shout out to Jaime Valero.

(* This is not an official Weights and Biases product.)

License: Apache License 2.0

