voidful / TFkit

🤖📇 Handling multiple NLP tasks in one pipeline

Home Page: https://voidful.github.io/TFkit/


What is it

TFKit is a toolkit aimed mainly at language generation.
It applies transformer models to many tasks, with different model architectures, inside one all-in-one framework.
Switching tasks or models only takes a small change to the config.

Supported Tasks

With transformer models such as BERT, ALBERT, T5, and BART:

Text Generation 📝 seq2seq language model
Text Generation 🖊️ causal language model
Text Generation 🖨️ once generation model / once generation model with CTC loss
Text Generation 📝 onebyone generation model

Getting Started

Learn more in the documentation.

How To Use

Step 0: Install

Install with pip, directly from the refactor-dataset branch on GitHub:

pip install git+https://github.com/voidful/TFkit.git@refactor-dataset

Step 1: Prepare the dataset in CSV format

Task format

Each row contains an input and its target, separated by a comma:

input, target
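
As a minimal sketch of the format above, the following Python snippet writes a two-column CSV with one input/target pair per row. The example rows, the helper name write_task_csv, and the choice to omit a header row are illustrative assumptions, not something this README specifies. Check the documentation for the exact layout each task expects.

```python
import csv

def write_task_csv(path, pairs):
    """Write (input, target) pairs as a two-column CSV, one pair per row."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for text, target in pairs:
            # csv.writer quotes fields that contain commas or quotes.
            writer.writerow([text, target])

# Hypothetical sentiment examples; replace with your own data.
examples = [
    ("The movie was great, I loved it", "positive"),
    ("Terrible service and cold food", "negative"),
]
write_task_csv("training_data.csv", examples)
```

Using the csv module rather than joining strings by hand keeps inputs that contain commas in a single column.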

Step 2: Train model

tfkit-train \
--task clas \
--config xlm-roberta-base \
--train training_data.csv \
--test testing_data.csv \
--lr 4e-5 \
--maxlen 384 \
--epoch 10 \
--savedir roberta_sentiment_classificer

Step 3: Evaluate

tfkit-eval \
--task roberta_sentiment_classificer/1.pt \
--metric clas \
--valid testing_data.csv
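
The evaluation command above points at a checkpoint file inside the save directory. The sketch below picks the highest-numbered checkpoint, assuming that training writes files named by number (1.pt, 2.pt, ...). That naming is inferred from the path in the command and is not confirmed here. The helper name latest_checkpoint is hypothetical.

```python
from pathlib import Path

def latest_checkpoint(savedir):
    """Return the .pt file with the highest numeric name in savedir, or None."""
    # Assumption: checkpoints are named <number>.pt; other .pt files are ignored.
    numbered = [p for p in Path(savedir).glob("*.pt") if p.stem.isdigit()]
    # Compare as integers so that 10.pt sorts after 2.pt.
    return max(numbered, key=lambda p: int(p.stem), default=None)

print(latest_checkpoint("roberta_sentiment_classificer"))
```

The highest-numbered file is not necessarily the best model, so compare several checkpoints on the validation data before choosing one.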

Advanced features

Multi-task training

Pass one entry per task to --task, --train, and --test:

tfkit-train \
  --task clas clas \
  --config xlm-roberta-base \
  --train training_data_taskA.csv training_data_taskB.csv \
  --test testing_data_taskA.csv testing_data_taskB.csv \
  --lr 4e-5 \
  --maxlen 384 \
  --epoch 10 \
  --savedir roberta_sentiment_classificer_multi_task
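
With several tasks, the three lists are easy to get out of step. The sketch below builds the same command from one list of (task, train file, test file) triples, so each task's files stay together. It assumes the CLI pairs the values by position, which the command above suggests but this README does not state. The helper name build_multitask_command is hypothetical, and only flags shown above are used.

```python
import shlex

def build_multitask_command(tasks, config, savedir, lr="4e-5", maxlen=384, epoch=10):
    """Assemble tfkit-train arguments from (task, train_csv, test_csv) triples."""
    if not tasks:
        raise ValueError("at least one task is required")
    # Split the triples into three parallel tuples that share one order.
    names, trains, tests = zip(*tasks)
    return [
        "tfkit-train",
        "--task", *names,
        "--config", config,
        "--train", *trains,
        "--test", *tests,
        "--lr", str(lr),
        "--maxlen", str(maxlen),
        "--epoch", str(epoch),
        "--savedir", savedir,
    ]

cmd = build_multitask_command(
    [
        ("clas", "training_data_taskA.csv", "testing_data_taskA.csv"),
        ("clas", "training_data_taskB.csv", "testing_data_taskB.csv"),
    ],
    config="xlm-roberta-base",
    savedir="roberta_sentiment_classificer_multi_task",
)
# Print the shell-quoted command for inspection before running it.
print(shlex.join(cmd))
```

To launch training from Python, pass the list to subprocess.run(cmd, check=True).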

Unmaintained tasks

Due to time constraints, the following tasks are temporarily not supported:

Classification 🏷️ multi-class and multi-label classification
Question Answering 📃 extractive QA
Question Answering 🔘 multiple-choice QA
Tagging 👁️‍🗨️ sequence-level tagging / sequence-level tagging with CRF
Self-supervised Learning 🤿 masked language model

Supplement

Contributing

Thanks for your interest. There are many ways to contribute to this project. Get started here.

License

Icons reference

Icons modified from Freepik, from www.flaticon.com
Icons modified from Nikita Golubev, from www.flaticon.com


License: Apache License 2.0

