hananshafi / MTL-ViT

A new multi-task learning framework using Vision Transformers

MTL-ViT: A new multi-task learning framework using Vision Transformers

(*Note: This is an ongoing project, so the full code and strategy have not yet been open-sourced by the author.)

We present a new multi-task learning strategy using Vision Transformers (ViTs). Our approach exploits the class token and self-attention mechanism of ViTs to train multiple tasks through a single ViT, more efficiently and within a limited computational budget.
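Since the full strategy is not yet released, the following is only an illustrative sketch of one possible reading of the idea, not the author's method. It assumes one learnable class token per task, all prepended to the same patch sequence and passed through a shared self-attention layer. The function names (`self_attention`, `multi_task_forward`), the task names (`classification`, `depth`), and the identity Q/K/V projections are hypothetical simplifications chosen to keep the example dependency-free.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: Q, K and V are the tokens themselves (identity projections);
    a real ViT would use learned projection matrices and several heads.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)]
        )
    return outputs


def multi_task_forward(patch_tokens, task_class_tokens):
    """Prepend one class token per task and run one shared attention layer.

    Returns a dict mapping each task name to the attended output of its own
    class token, which a task-specific head could then consume.
    """
    names = list(task_class_tokens)
    sequence = [task_class_tokens[n] for n in names] + patch_tokens
    attended = self_attention(sequence)
    return {name: attended[i] for i, name in enumerate(names)}


random.seed(0)
DIM, NUM_PATCHES = 8, 4  # toy sizes, chosen only for illustration

patches = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_PATCHES)]
class_tokens = {  # hypothetical tasks
    "classification": [random.gauss(0, 1) for _ in range(DIM)],
    "depth": [random.gauss(0, 1) for _ in range(DIM)],
}

features = multi_task_forward(patches, class_tokens)
for task, vec in features.items():
    print(task, len(vec))
```

In this sketch the patch tokens are processed once, and each task reads its feature from its own class token, which is one way a single backbone could serve several tasks on a limited compute budget.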

Total Loss of the Multi-task system:

Languages

Language: Python 100.0%