chkla / multilabel-transformer

Tutorial on multilabel classification w/ Huggingface 🤗 and AdapterHub 🤖


Multilabel Classification with Huggingface's 🤗 Trainer 💪 and AdapterHub 🤖: A Short Tutorial for Multilabel Classification with Language Models

If you are a fan of the HuggingFace API 🤗, you may have noticed the new Trainer 💪 class (introduced in version 2.9):

from transformers import Trainer

# model, args (a TrainingArguments instance), TRAIN_DATA/TEST_DATA (tokenized datasets),
# tokenizer and compute_metrics are defined beforehand
trainer = Trainer(
    model,
    args,
    train_dataset=TRAIN_DATA,
    eval_dataset=TEST_DATA,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics
)

The Trainer class lets you fine-tune your own language model (e.g., BERT - if you have never heard of language models like BERT before, you should stop here first and look at this amazing blog post) in a few lines of code, while still offering all the options to customize the training (check out the sentence classification example provided by HuggingFace).
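The snippet above passes a compute_metrics function to the Trainer. For multilabel classification, one possible implementation (a sketch, not taken from the official tutorial) applies a sigmoid to the logits, thresholds at 0.5, and reports a micro-averaged F1 score:

import numpy as np
from sklearn.metrics import f1_score

def compute_metrics(eval_pred):
    # the Trainer hands over an EvalPrediction holding raw logits and the gold labels
    logits, labels = eval_pred.predictions, eval_pred.label_ids
    # sigmoid + 0.5 threshold: each label is predicted independently
    probs = 1 / (1 + np.exp(-logits))
    preds = (probs >= 0.5).astype(int)
    return {"f1_micro": f1_score(labels, preds, average="micro")}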

The official tutorial only shows how to do this without the new Trainer 💪 class. In this notebook I will show a small example where AdapterHub 🤖 does the job for you by providing a multilabel head out of the box.

| type | notebook |
| --- | --- |
| multilabel-adapter | Open In Colab |
| multilabel-transformer | Open In Colab |
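The adapter route boils down to a few lines. Here is a minimal sketch assuming the adapter-transformers API (AutoModelWithHeads, add_adapter, add_classification_head with a multilabel flag) - the notebooks above contain the full, tested versions:

from transformers import AutoModelWithHeads  # adapter-transformers, the AdapterHub fork

model = AutoModelWithHeads.from_pretrained("roberta-base")
# add a task adapter and a classification head with independent (sigmoid) label outputs
model.add_adapter("multilabel_task")
model.add_classification_head("multilabel_task", num_labels=N, multilabel=True)
# freeze the base model and only train the adapter weights
model.train_adapter("multilabel_task")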

You can also use the fast lane 🚀 by importing the MultilabelTransformer provided in this repository.

from MultilabelTransformer import MultilabelRobertaForSequenceClassification

# N is the number of labels in your dataset
model = MultilabelRobertaForSequenceClassification.from_pretrained('roberta-base', num_labels=N)

Note: MultilabelTransformer currently supports MultilabelRobertaForSequenceClassification and MultilabelBertForSequenceClassification.
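To put everything together, here is a rough sketch of how such a model could be fed to the Trainer - the multi-hot float label vectors are an assumption based on the BCE-style loss typically used for multilabel heads, and the texts, labels and output directory are purely illustrative:

import torch
from transformers import RobertaTokenizer, Trainer, TrainingArguments
from MultilabelTransformer import MultilabelRobertaForSequenceClassification

N = 3  # hypothetical number of labels
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = MultilabelRobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=N)

class MultilabelDataset(torch.utils.data.Dataset):
    """Pairs tokenized texts with multi-hot float label vectors (one slot per label)."""
    def __init__(self, texts, labels):
        self.encodings = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx], dtype=torch.float)
        return item
    def __len__(self):
        return len(self.labels)

train_data = MultilabelDataset(["a first example", "a second example"],
                               [[1, 0, 1], [0, 1, 1]])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=train_data,
)
trainer.train()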

Happy Researching 👨‍🔬!
