Multi-Task-Deep-Learning

A list of papers, codes and applications on multi-task deep learning. Comments and contributions are welcome!

And it is still being updated...


Table of Contents:

  • Papers
    • Survey
    • Theory
    • Architecture design
    • Architecture Search
    • Dynamic Architecture
    • Probabilistic MTL
    • Task relationship learning
    • Optimization Methods
    • Novel Settings
  • Datasets
  • Applications
  • Related Areas
  • Trends

Papers

Survey

  • [1997] Caruana, R. Multitask Learning. Machine Learning 28, 41–75 (1997). https://doi.org/10.1023/A:1007379606734.
  • Multi-Task Learning: Theory, Algorithms, and Applications (SDM 2012 tutorial). http://www.siam.org/meetings/sdm12/zhou_chen_ye.pdf
  • A Survey on Multi-Task Learning. arXiv, July 2017.
  • An Overview of Multi-Task Learning in Deep Neural Networks. arXiv, June 2017.
  • A brief review on multi-task learning. Multimedia Tools and Applications, 77(22):29705–29725, November 2018.
  • Multi-task learning for dense prediction tasks: A survey. arXiv, April 2020.
  • A Brief Review of Deep Multi-task Learning and Auxiliary Task Learning. arXiv, July 2020.
  • Multi-task learning with deep neural networks: A survey, 2020.
  • [arXiv 2022] Multi-Task Learning for Visual Scene Understanding. paper
    • PhD Thesis.
  • [arXiv 2022] A Survey of Multi-task Learning in Natural Language Processing: Regarding Task Relatedness and Training Methods. paper

Theory

  • [2019] How to study the neural mechanisms of multiple tasks, paper
  • [ICLR 2021] Deciphering and Optimizing Multi-Task Learning: a Random Matrix Approach, https://openreview.net/forum?id=Cri3xz59ga
  • [ICML 2021] Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation, paper, code
  • [bioRxiv 2021] Abstract representations emerge naturally in neural networks trained to perform multiple tasks, paper
  • [NeurIPS 2023] Revisiting Scalarization in Multi-Task Learning: A Theoretical Perspective. paper

Architecture design

Pure hard parameter sharing

  • [ICCV 2017] Multi-task Self-Supervised Visual Learning. paper
  • MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving. In IEEE Intelligent Vehicles Symposium, Proceedings, 2018.
  • Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 7482–7491, 2018.
  • UberNet: Training a universal convolutional neural network for low-, mid-, and high-level vision using diverse datasets and limited memory. Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, 2017-January:5454–5463, 2017.
  • Modeling task relationships in multi-task learning with multi-gate mixture-of-experts. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD ’18, page 1930–1939, New York, NY, USA, 2018. Association for Computing Machinery.
  • [CVPR 2018] PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning, Paper, Code
  • [ECCV 2018] Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights, Paper, Code
    • learn to mask weights of an existing network
  • [ICRA 2019] Real-Time Joint Semantic Segmentation and Depth Estimation Using Asymmetric Annotations. paper
    • Uses asymmetric datasets with uneven numbers of annotations via knowledge distillation (under the assumption of a powerful teacher network).
  • [CCN 2019] Modulation of early visual processing alleviates capacity limits in solving multiple tasks, Paper
    • By associating neural modulations with task-based switching of the state of the network and characterizing when such switching is helpful in early processing, our results provide a functional perspective towards understanding why task-based modulation of early neural processes might be observed in the primate visual cortex.
  • [CVPR 2019] Attentive Single-Tasking of Multiple Tasks, paper, code
    • We refine features with a task-specific residual adapter branch (RA) and attend to particular channels with task-specific Squeeze-and-Excitation (SE) modulation.
    • We also enforce the task gradients to be statistically indistinguishable through adversarial training.
  • [AAAI 2020] Learning Sparse Sharing Architectures for Multiple Tasks, paper
  • [ICML 2020] Learning to Branch for Multi-Task Learning, http://proceedings.mlr.press/v119/guo20e.html.
  • [arXiv 2021] UniT: Multimodal Multitask Learning with a Unified Transformer, https://arxiv.org/abs/2102.10772, Code
  • [arXiv 2021] You Only Learn One Representation: Unified Network for Multiple Tasks, paper, Code
  • [arXiv 2021] Spatio-Temporal Multi-Task Learning Transformer for Joint Moving Object Detection and Segmentation. paper
  • [NeurIPS 2021] MTL-TransMODS: Cascaded Multi-Task Learning for Moving Object Detection and Segmentation with Unified Transformers. paper
  • [NeurIPS 2021] SOLQ: Segmenting Objects by Learning Queries. paper, code
    • Builds on DETR and outperforms it on detection, with unified learned queries for instance class, location and mask.
    • Mask branch is supervised with DCT-compressed representation.
  • [CVPR 2021] CompositeTasking: Understanding Images by Spatial Composition of Tasks, paper, Code. One network for multiple tasks, but requires multiple inference passes.
  • [ICCV 2021] Multi-Task Self-Training for Learning General Representations, paper
    • Multi-task self-training with pseudo labels (generated by multiple single-task teachers)
    • Cross training on multiple vision datasets
  • [ICCV 2021] MultiTask-CenterNet (MCN): Efficient and Diverse Multitask Learning using an Anchor Free Approach, paper
  • [arXiv 2021] Avoiding Catastrophe: Active Dendrites Enable Multi-Task Learning in Dynamic Environments. paper
    • Sparse representation.
  • [ECCV 2022] Inverted Pyramid Multi-task Transformer for Dense Scene Understanding. paper. code
  • [arXiv 2022] Multitask Emotion Recognition Model with Knowledge Distillation and Task Discriminator. paper
    • Multi-task model with a gradient reversal layer and a task discriminator.
  • [arXiv 2022] M^2BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Bird’s-Eye View Representation. paper. project
    • Comparison with LSS: 2D-to-3D transformation without estimating depth (each pixel in the 2D feature map is mapped to a set of points along the camera ray in 3D space).
    • Multi-tasking 3D detection and BEV segmentation causes a slight drop in performance.
  • [ICME 2022] Rethinking Hard-Parameter Sharing in Multi-Domain Learning. paper
  • [NeurIPS 2022] Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving. paper
    • An LV-Adapter incorporates language priors into the multi-task model via task-specific prompting and alignment between visual and textual features.
  • [arXiv 2023] A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision. paper
  • [arXiv 2023] InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding. paper. code
  • [arXiv 2023] Prompt Guided Transformer for Multi-Task Dense Prediction. paper
  • [CVPR 2023] Mitigating Task Interference in Multi-Task Learning via Explicit Task Routing with Non-Learnable Primitives. paper. code.
  • [ICCV 2023] Vision Transformer Adapters for Generalizable Multitask Learning. paper
  • [arXiv 2024] Giving each task what it needs -- leveraging structured sparsity for tailored multi-task learning. paper
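
Most entries above instantiate the same template: one backbone whose parameters are shared by all tasks, plus a small head per task. A minimal PyTorch sketch of that template (layer sizes and task output dimensions are made-up toy values, not from any particular paper):

```python
import torch
import torch.nn as nn

class HardSharingNet(nn.Module):
    """Shared backbone + one small head per task (toy sizes)."""
    def __init__(self, in_dim=64, hidden=128, task_out_dims=(10, 1)):
        super().__init__()
        # "Hard" sharing: every task uses these same backbone parameters.
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Each task gets its own lightweight output head.
        self.heads = nn.ModuleList(nn.Linear(hidden, d) for d in task_out_dims)

    def forward(self, x):
        z = self.backbone(x)                  # shared representation
        return [head(z) for head in self.heads]

net = HardSharingNet()
outputs = net(torch.randn(8, 64))             # one output tensor per task
```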

Pure soft parameter sharing

  • Cross-Stitch Networks for Multi-task Learning. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016. (A minimal cross-stitch unit is sketched after this list.)
  • Deep Multi-task Representation Learning: A Tensor Factorisation Approach. 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings, May 2016. Code
  • [AAAI 2019] Latent multi-task architecture learning. paper, Code
  • NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 3200–3209. IEEE, June 2019. Code
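
A cross-stitch unit linearly mixes the activations of two task-specific networks with learned coefficients. A minimal sketch (the paper learns the mixing coefficients per channel; a single 2x2 matrix per unit is assumed here for brevity):

```python
import torch
import torch.nn as nn

class CrossStitchUnit(nn.Module):
    """Learned 2x2 mixing of two tasks' activations (Cross-Stitch Networks)."""
    def __init__(self):
        super().__init__()
        # Near-identity init: each task starts mostly with its own features.
        self.alpha = nn.Parameter(torch.tensor([[0.9, 0.1],
                                                [0.1, 0.9]]))

    def forward(self, x_a, x_b):
        out_a = self.alpha[0, 0] * x_a + self.alpha[0, 1] * x_b
        out_b = self.alpha[1, 0] * x_a + self.alpha[1, 1] * x_b
        return out_a, out_b

stitch = CrossStitchUnit()
f_a, f_b = stitch(torch.randn(4, 32, 8, 8), torch.randn(4, 32, 8, 8))
```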

A mix of hard and soft

  • End-to-end multi-task learning with attention. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019-June:1871–1880, 2019. Code
  • Pad-net: Multi-tasks guided prediction-and-distillation network for simultaneous depth estimation and scene parsing. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 675–684, 2018.
  • Pattern-affinitive propagation across depth, surface normal and semantic segmentation. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4101–4110, 2019.
  • Mti-net: Multi-scale task interaction networks for multi-task learning. In ECCV, 2020. Code
  • Attentive Single-Tasking of Multiple Tasks. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1851–1860. IEEE, June 2019. Code
  • Many Task Learning With Task Routing. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages 1375–1384. IEEE, October 2019.
  • [CVPR 2022] Task Adaptive Parameter Sharing for Multi-Task Learning. paper
    • Differentiable task-specific parameters as perturbation of the base network.
  • [arXiv 2022] Cross-task Attention Mechanism for Dense Multi-task Learning. paper
  • [AAAI 2023] DeMT: Deformable Mixer Transformer for Multi-Task Learning of Dense Prediction. paper. Code
  • [arXiv 2023] Deformable Mixer Transformer with Gating for Multi-Task Learning of Dense Prediction. paper. code
    • Combines the merits of deformable CNNs and query-based Transformers for multi-task learning of dense prediction.
  • [ICLR 2023] TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding. paper. Code.
    • a novel spatial-channel multi-task prompting transformer framework.
  • [CVPR 2023] Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving. paper
  • [arXiv 2023] A Dynamic Feature Interaction Framework for Multi-task Visual Perception. paper
  • [ICCV 2023] Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving. paper
    • VTDNet groups similar tasks and employs task interaction stages to exchange information within and between task groups
    • a Curriculum training, Pseudo-labeling, and Fine-tuning (CPF) scheme
  • [arXiv 2023] Rethinking of Feature Interaction for Multi-task Learning on Dense Prediction. paper

Architecture Search

  • [NeurIPS 2020] Adashare: Learning what to share for efficient deep multi-task learning. ArXiv, abs/1911.12423, 2020. Code
  • [CVPR 2020] Mtl-nas: Task-agnostic neural architecture search towards general-purpose multi-task learning. Code
  • [arXiv 2021] AutoMTL: A Programming Framework for Automated Multi-Task Learning. paper. Code
  • [arXiv 2021] FBNetV5: Neural Architecture Search for Multiple Tasks in One Run. paper
  • [arXiv 2023] AutoTaskFormer: Searching Vision Transformers for Multi-task Learning. paper

Dynamic Architecture

  • [CVPR 2022] Controllable Dynamic Multi-Task Architectures. paper. project
  • [arXiv 2022] An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems. paper
  • [arXiv 2022] muNet: Evolving Pretrained Deep Neural Networks into Scalable Auto-tuning Multitask Systems. paper
    • Andrea Gesmundo, Jeff Dean
    • A ViT-L architecture (307M parameters) was evolved into a multitask system with 13,087M parameters jointly solving 69 tasks.
  • [NeurIPS 2022] M3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design. paper, Code (a minimal task-routed MoE layer is sketched after this list)
    • at training, it disentangles the parameter spaces to avoid different tasks’ training conflicts.
    • at inference, it allows for activating only the task-corresponding sparse “expert” pathway, instead of the full model
  • [arXiv 2022] Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners. paper
    • we incorporate mixture of experts (MoE) layers into a transformer model, with a new loss that incorporates the mutual dependence between tasks and experts. This prevents the sharing of the entire backbone model between all tasks, which strengthens the model, especially when the training set size and the number of tasks scale up.
  • [ICLR 2023] Recon: Reducing Conflicting Gradients From the Root For Multi-Task Learning. paper
    • Investigate the task gradients w.r.t. each shared network layer, select the layers with high conflict scores, and set them task-specific.
  • [arXiv 2023] DynaShare: Task and Instance Conditioned Parameter Sharing for Multi-Task Learning. paper
  • [ICCV 2023] Efficient Controllable Multi-Task Architectures. paper
  • [arXiv 2024] Merging Multi-Task Models via Weight-Ensembling Mixture of Experts. paper. code
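
A hedged sketch of the task-conditioned mixture-of-experts idea behind M3ViT and Mod-Squad: each task learns its own gate over a shared pool of experts, and only the top-k experts for the active task are executed. Expert shapes, the gate parameterization and k below are illustrative assumptions, not either paper's exact design:

```python
import torch
import torch.nn as nn

class TaskMoELayer(nn.Module):
    """Task-conditioned MoE layer: per-task gate over shared experts."""
    def __init__(self, dim=64, num_experts=4, num_tasks=2, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
            for _ in range(num_experts)
        )
        # One learned gating distribution per task (illustrative parameterization).
        self.task_gates = nn.Parameter(torch.zeros(num_tasks, num_experts))
        self.top_k = top_k

    def forward(self, x, task_id):
        gate = torch.softmax(self.task_gates[task_id], dim=-1)
        top_w, top_i = gate.topk(self.top_k)       # sparse subset of experts
        out = torch.zeros_like(x)
        for w, i in zip(top_w, top_i.tolist()):    # only k experts actually run
            out = out + w * self.experts[i](x)
        return out

layer = TaskMoELayer()
y = layer(torch.randn(8, 64), task_id=0)
```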

Probabilistic MTL

  • [NeurIPS 2021] Variational Multi-Task Learning with Gumbel-Softmax Priors. paper

Task relationship learning

  • [CVPR 2018] Taskonomy: Disentangling Task Transfer Learning.
  • [CVPR 2017] Fully-Adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification.
  • [arXiv 2020] Branched multi-task networks: Deciding what layers to share.
  • [arXiv 2020] Automated Search for Resource-Efficient Branched Multi-Task Networks.
  • [ICML 2020] Learning to Branch for Multi-Task Learning. paper
  • [arXiv 2020] Measuring and harnessing transference in multi-task learning, paper
  • [ICML 2020] Which Tasks Should Be Learned Together in Multi-task Learning?, http://proceedings.mlr.press/v119/standley20a.html, Code
  • [ICLR 2021] Auxiliary Task Update Decomposition: The Good, the Bad and the Neutral, https://openreview.net/forum?id=1GTma8HwlYp
    • decompose auxiliary updates into directions which help, damage or leave the primary task loss unchanged
  • [NeurIPS 2021] Efficiently Identifying Task Groupings for Multi-Task Learning, paper, code
    • Our method determines task groupings in a single run by training all tasks together and quantifying the extent to which one task's gradient would affect another task's loss. (A simplified version of this affinity computation is sketched after this list.)
    • Based on the ideas and concepts in Measuring and harnessing transference in multi-task learning.
  • [arXiv 2022] Editing Models with Task Arithmetic. paper
  • [arXiv 2023] AdaMerging: Adaptive Model Merging for Multi-Task Learning. paper
  • [arXiv 2024] Representation Surgery for Multi-Task Model Merging. paper
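
A simplified version of the lookahead affinity used in Efficiently Identifying Task Groupings: take one gradient step on task i alone and measure how task j's loss changes. `losses_fn` is a hypothetical callable returning a dict of per-task scalar losses; the paper updates only the shared parameters, while this sketch updates all of them:

```python
import copy
import torch

def inter_task_affinity(model, losses_fn, batch, task_i, task_j, lr=1e-2):
    """Lookahead affinity of task_i onto task_j (simplified sketch).

    losses_fn(model, batch) -> dict of per-task scalar losses (hypothetical).
    """
    base_loss_j = losses_fn(model, batch)[task_j].item()

    probe = copy.deepcopy(model)          # one lookahead SGD step on task_i alone
    losses_fn(probe, batch)[task_i].backward()
    with torch.no_grad():
        for p in probe.parameters():
            if p.grad is not None:
                p -= lr * p.grad

    new_loss_j = losses_fn(probe, batch)[task_j].item()
    return 1.0 - new_loss_j / base_loss_j  # > 0: task_i's step also helped task_j
```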

Optimization Methods

Loss function

  • Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 7482–7491, 2018. (A minimal sketch of this uncertainty weighting appears after this list.)
  • Auxiliary Tasks in Multi-task Learning. arXiv, May 2018.
  • GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. 35th International Conference on Machine Learning, ICML 2018, 2:1240–1251, 2018.
  • Self-paced multi-task learning. AAAI Conference on Artificial Intelligence, pages 2175–2181, 2017.
  • Dynamic task prioritization for multitask learning. ECCV 2018 - 15th European Conference, Munich, Germany, September 8-14, 2018.
  • Focal Loss for Dense Object Detection. In Proceedings of the IEEE International Conference on Computer Vision, 2017.
  • End-to-end multi-task learning with attention. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019-June:1871–1880, 2019. Code
  • MultiNet++: Multi-Stream Feature Aggregation and Geometric Loss Strategy for Multi-Task Learning. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), volume 2019-June, pages 1200–1210. IEEE, June 2019.
  • Dynamic Task Weighting Methods for Multi-task Networks in Autonomous Driving Systems. In 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), pages 1–8. IEEE, September 2020.
  • A Comparison of Loss Weighting Strategies for Multi task Learning in Deep Neural Networks. IEEE Access, 7:141627–141632, 2019.
  • [ICRA 2021] OmniDet Surround View Cameras Based Multi-Task Visual Perception Network for Autonomous Driving, paper
  • [CVPR 2021] Taskology: Utilizing Task Relations at Scale, paper
  • [arXiv 2021] A Closer Look at Loss Weighting in Multi-Task Learning. paper
    • Trains the MTL model with random loss weights sampled from a distribution.
  • [arXiv 2022] In Defense of the Unitary Scalarization for Deep Multi-Task Learning. paper
    • None of the ad-hoc multi-task optimization algorithms consistently outperform unitary scalarization, where training simply minimizes the sum of the task losses.
  • [ICLR 2022] Weighted Training for Cross-Task Learning, paper
    • Target-Aware Weighted Training (TAWT) minimizes a representation-based task distance between the source and target tasks.
  • [arXiv 2022] Auto-Lambda: Disentangling Dynamic Task Relationships. paper, Code
  • [arXiv 2022] Universal Representations: A Unified Look at Multiple Task and Domain Learning. paper, Code
    • Distill knowledge from single-task networks.
  • [arXiv 2023] Sample-Level Weighting for Multi-Task Learning with Auxiliary Tasks. paper
  • [arXiv 2024] CoTBal: Comprehensive Task Balancing for Multi-Task Visual Instruction Tuning. paper
    • two key dimensions for task balancing: (1) Inter-Task Contribution, the phenomenon where learning one task potentially enhances the performance in other tasks, attributable to the overlapping knowledge domains, and (2) Intra-Task Difficulty, which refers to the learning difficulty within a single task.
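
For concreteness, the homoscedastic uncertainty weighting of Kendall et al. (the first entry in this list) is usually implemented with a learnable log-variance s_i per task, scaling each loss by exp(-s_i) and adding s_i as a regularizer. A minimal sketch of that common parameterization (the paper derives slightly different constants for regression vs. classification losses):

```python
import torch
import torch.nn as nn

class UncertaintyWeighting(nn.Module):
    """Learnable uncertainty-based loss weighting, with s_i = log(sigma_i^2):
    total = sum_i exp(-s_i) * L_i + s_i."""
    def __init__(self, num_tasks):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for s, loss in zip(self.log_vars, task_losses):
            # exp(-s) down-weights noisy tasks; + s keeps sigma from growing unboundedly.
            total = total + torch.exp(-s) * loss + s
        return total

weighting = UncertaintyWeighting(num_tasks=2)
total_loss = weighting([torch.tensor(1.3), torch.tensor(0.4)])
```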

Optimization

  • [Comptes Rendus Mathematique 2012] Multiple-gradient descent algorithm (MGDA) for multiobjective optimization. paper
  • [NeurIPS 2018] Multi-task learning as multi-objective optimization, paper
    • MGDA-UB
  • [ICML 2018] Deep asymmetric multi-task feature learning.
  • [NeurIPS 2019] Pareto multi-task learning. paper. Code
  • [NeurIPS 2020] Gradient Surgery for Multi-Task Learning. paper, Code (a minimal PCGrad sketch follows this list)
  • [NeurIPS 2020] Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout. paper
  • [ICML 2020] Multi-Task Learning with User Preferences: Gradient Descent with Controlled Ascent in Pareto Optimization, paper, Code
  • [ICML 2020] Efficient Continuous Pareto Exploration in Multi-Task Learning, paper, Code
  • [ICML 2020] Adaptive Adversarial Multi-task Representation Learning, paper
  • [AAAI 2021] Task uncertainty loss reduce negative transfer in asymmetric multi-task feature learning. paper
  • [AISTATS 2021] High-Dimensional Multi-Task Averaging and Application to Kernel Mean Embedding, paper
  • [ICLR 2021] Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models, paper
  • [ICLR 2021] Towards Impartial Multi-task Learning, paper
  • [NeurIPS 2021] Conflict-Averse Gradient Descent for Multi-task learning, paper
  • [NeurIPS 2021] Profiling Pareto Front With Multi-Objective Stein Variational Gradient Descent, paper, code
  • [arXiv 2022] Multi-Task Learning as a Bargaining Game, paper, code
  • [ICLR 2022] Relational Multi-Task Learning: Modeling Relations between Data and Tasks, paper
    • The proposed MetaLink reinterprets the last layer’s weights of each task as task nodes and creates a knowledge graph where data points and tasks are nodes and labeled edges provide information about labels of data points on tasks.
  • [ICLR 2022] RotoGrad: Gradient Homogenization in Multitask Learning, paper, code
    • introduced a rotation layer between the shared backbone and task-specific branches to align gradient directions.
  • [ICLR 2022] Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning, paper
  • [arXiv 2022] On Steering Multi-Annotations per Sample for Multi-Task Learning. paper
    • Each sample is randomly allocated a subset of tasks during training. (In our view, this can be regarded as a special case of A Closer Look at Loss Weighting in Multi-Task Learning.)
  • [arXiv 2022] Leveraging convergence behavior to balance conflicting tasks in multi-task learning. paper
    • Proposes a method that takes the temporal behaviour of the gradients into account to create a dynamic bias that adjusts the importance of each task during backpropagation.
  • [arXiv 2022] Do Current Multi-Task Optimization Methods in Deep Learning Even Help? paper
    • Despite the added design and computational complexity of these algorithms, MTO methods do not yield any performance improvements beyond what is achievable via traditional optimization approaches.
  • [NeurIPS 2022] On the Convergence of Stochastic Multi-Objective Gradient Manipulation and Beyond. paper
    • Stochastic gradient manipulation algorithms may fail to converge to Pareto-optimal solutions; the authors fix this with an algorithm that averages past weights.
  • [AAAI 2023] AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning. paper
    • Separate the accumulative gradients and hence the learning rate of each task for each parameter in adaptive learning rate approaches.
  • [ICLR 2023] Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach. paper
    • a stochastic multi-objective gradient correction (MoCo) method that can guarantee convergence without increasing the batch size even in the nonconvex setting.
  • [arXiv 2023] ForkMerge: Overcoming Negative Transfer in Multi-Task Learning. paper
  • [CVPR 2023] Independent Component Alignment for Multi-Task Learning. paper, code
  • [NeurIPS 2023] FAMO: Fast Adaptive Multitask Optimization. paper, code
    • a dynamic weighting method that decreases task losses in a balanced way using O(1) space and time.
  • [arXiv 2023] FULLER: Unified Multi-modality Multi-task 3D Perception via Multi-level Gradient Calibration. paper
  • [arXiv 2023] A Scale-Invariant Task Balancing Approach for Multi-Task Learning. paper
  • [MICCAI 2023] Multi-Task Cooperative Learning via Searching for Flat Minima. paper
  • [arXiv 2023] Challenging Common Assumptions in Multi-task Learning. paper
  • [arXiv 2024] Quantifying Task Priority for Multi-Task Optimization. paper
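
As a concrete example of the gradient-manipulation family, here is a minimal sketch of PCGrad from Gradient Surgery for Multi-Task Learning (flattened per-task gradients assumed; the random projection order follows the paper, while the final combination is simplified):

```python
import torch

def pcgrad(task_grads):
    """PCGrad: project away conflicting gradient components (NeurIPS 2020).

    task_grads: list of flattened per-task gradient tensors.
    """
    projected = [g.clone() for g in task_grads]
    for i, g_i in enumerate(projected):
        # Visit the other tasks in random order, as in the paper.
        for j in torch.randperm(len(task_grads)).tolist():
            if j == i:
                continue
            g_j = task_grads[j]
            dot = torch.dot(g_i, g_j)
            if dot < 0:  # conflict: remove the component opposing task j
                g_i -= dot / g_j.norm().pow(2) * g_j
    # Combine the projected gradients into one update direction.
    return torch.stack(projected).sum(dim=0)

update = pcgrad([torch.randn(10), torch.randn(10), torch.randn(10)])
```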

Novel Settings

  • [CVPR 2019] Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks, paper
  • [ICML 2020] Task Understanding from Confusing Multi-task Data, paper
  • [ECCV 2020] Multitask Learning Strengthens Adversarial Robustness, paper, Code
  • [ICLR 2021] The Traveling Observer Model: Multi-task Learning Through Spatial Variable Embeddings, paper
    • a machine learning framework in which seemingly unrelated tasks can be solved by a single model, by embedding their input and output variables into a shared space.
  • [arXiv 2021] Learning Multiple Dense Prediction Tasks from Partially Annotated Data, paper
  • [ICLR 2022] Multi-Task Neural Processes, paper
  • [DAC 2022] MIME: Adapting a Single Neural Network for Multi-task Inference with Memory-efficient Dynamic Pruning. paper
    • MIME results in highly memory-efficient DRAM storage of neural-network parameters for multiple tasks compared to conventional multi-task inference.
  • [T-PAMI 2023] Performance-aware Approximation of Global Channel Pruning for Multitask CNNs, paper, Code
  • [arXiv 2023] Efficient Computation Sharing for Multi-Task Visual Scene Understanding, paper
  • [CVPR 2023] AdaMTL: Adaptive Input-dependent Inference for Efficient Multi-Task Learning. paper, Code.
  • [arXiv 2023] Label Budget Allocation in Multi-Task Learning. paper.
  • [BMVC 2023] Data exploitation: multi-task learning of object detection and semantic segmentation on partially annotated data. paper.
  • [ICCV 2023] Multi-task View Synthesis with Neural Radiance Fields. paper. project. Code
    • we present a novel problem setting -- multi-task view synthesis (MTVS), which reinterprets multi-task prediction as a set of novel-view synthesis tasks for multiple scene properties, including RGB.
  • [arXiv 2024] Robust Analysis of Multi-Task Learning on a Complex Vision System. paper
  • [CVPR 2024] DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data. paper. Code

Note that many MTL approaches use not just one category of methods listed above but a combination of several.

Datasets

Commonly used in computer vision:

  • Taskonomy is currently the largest dataset specifically designed for multi-task learning. It has about 4.5 million images of indoor scenes from 3D scans of about 600 buildings and every image has an annotation for all 26 tasks.
  • NYU v2 is a large-scale dataset for indoor scene understanding, which covers a variety of computer vision tasks. There are in total 1449 densely labeled RGBD images, capturing 464 diverse indoor scenes, with 35,064 distinct objects from 894 different classes.
  • MultiMNIST is an MTL version of the MNIST dataset. It is formed by overlaying pairs of handwritten digit images: one digit is placed at the top-left and the other at the bottom-right. The tasks are to simultaneously classify the top-left and the bottom-right digits. (A construction sketch follows this list.)
  • CelebA contains 10,000 identities, each with 20 images, resulting in 200,000 images in total. Since CelebA is annotated with 40 face attributes and 5 key points, it can be used in an MTL setting by treating each attribute as a distinct classification task.
  • Cityscapes is established for semantic urban scene understanding. It comprises 5000 images with high-quality pixel-level annotations as well as 20,000 additional images with coarse annotations. Tasks like semantic segmentation, instance segmentation and depth estimation can be trained jointly on Cityscapes.
  • MS-COCO is a widely used dataset in CV. It contains 328k images with a total of 2.5 million labeled instances spanning 91 object types. It can be used for multiple tasks including image classification, detection and segmentation.
  • KITTI is by far the most famous and commonly used dataset for autonomous driving. It provides benchmarks for multiple driving tasks: e.g. stereo matching, optical flow estimation, visual odometry/SLAM, semantic segmentation, object detection/orientation estimation and object tracking.
  • BDD100K is a recent driving dataset designed for heterogeneous multitask learning. It is comprised of 100K video clips and 10 tasks: image tagging, lane detection, drivable area segmentation, road object detection, semantic segmentation, instance segmentation, multi-object detection tracking, multi-object segmentation tracking, domain adaptation and imitation learning.
  • [ICCV 2019] WoodScape: A Multi-Task, Multi-Camera Fisheye Dataset for Autonomous Driving. paper, code
  • [CVPR 2021] TransNAS-Bench-101: Improving Transferability and Generalizability of Cross-Task Neural Architecture Search. paper
  • [ICCV 2021] Omnidata: Generating multi-task mid-level vision datasets from 3D scans. paper
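
The MultiMNIST construction referenced above is easy to reproduce. A minimal sketch (exact offsets and canvas size vary across implementations; a 36x36 canvas with fixed corner placements is assumed here):

```python
import torch
from torchvision import datasets, transforms

def make_multimnist(img_a, img_b):
    """Overlay two 28x28 digits on a 36x36 canvas: img_a at the top-left,
    img_b at the bottom-right."""
    canvas = torch.zeros(36, 36)
    canvas[:28, :28] = img_a
    canvas[8:, 8:] = torch.maximum(canvas[8:, 8:], img_b)  # max blend in the overlap
    return canvas

mnist = datasets.MNIST(root=".", train=True, download=True,
                       transform=transforms.ToTensor())
(x1, y1), (x2, y2) = mnist[0], mnist[1]
image = make_multimnist(x1[0], x2[0])  # task labels: y1 (top-left), y2 (bottom-right)
```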

Applications

Natural language processing

Speech processing

Computer vision

Medical Imaging

Autonomous Driving

  • [ICML 2019] Multi-Task Learning in the Wilderness. https://slideslive.com/38917690/multitask-learning-in-the-wilderness
  • [arXiv 2020] Efficient Latent Representations using Multiple Tasks for Autonomous Driving, paper
  • [arXiv 2021] MonoGRNet: A General Framework for Monocular 3D Object Detection, paper
  • [CVPR 2021] Multi-task Learning with Attention for End-to-end Autonomous Driving. paper
  • [ICRA 2021] OmniDet Surround View Cameras Based Multi-Task Visual Perception Network for Autonomous Driving, paper
  • [CVPR 2021] Deep Multi-Task Learning for Joint Localization, Perception, and Prediction. paper
  • [CVPR 2022 workshop] LidarMultiNet: Unifying LiDAR Semantic Segmentation, 3D Object Detection, and Panoptic Segmentation in a Single Multi-task Network. paper
    • LidarMultiNet: Towards a Unified Multi-task Network for LiDAR Perception. paper
  • [arXiv 2022] BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving. paper. code
  • [arXiv 2022] Perceive, Interact, Predict: Learning Dynamic and Static Clues for End-to-End Motion Prediction. paper
    • jointly and interactively performs online mapping, object detection and motion prediction.
  • [IEEE TIV 2022] Surround-view Fisheye BEV-Perception for Valet Parking: Dataset, Baseline and Distortion-insensitive Multi-task Framework. paper
  • [arXiv 2022] Goal-oriented Autonomous Driving. paper. project
    • the first comprehensive framework to date that incorporates full-stack driving tasks in one network.
  • [arXiv 2023] AOP-Net: All-in-One Perception Network for Joint LiDAR-based 3D Object Detection and Panoptic Segmentation. paper
  • [arXiv 2023] LiDARFormer: A Unified Transformer-based Multi-task Network for LiDAR Perception. paper
  • [CVPR 2023] TBP-Former: Learning Temporal Bird's-Eye-View Pyramid for Joint Perception and Prediction in Vision-Centric Autonomous Driving. paper. code.
  • [arXiv 2023] LiDAR-BEVMTN: Real-Time LiDAR Bird's-Eye View Multi-Task Perception Network for Autonomous Driving. paper.
  • [RA-L 2023] Multi-Modal Multi-Task (3MT) Road Segmentation. paper. code.
  • [CVPR-W] LeTFuser: Light-weight End-to-end Transformer-Based Sensor Fusion for Autonomous Driving with Multi-Task Learning. paper. code

Others
  • [SIGKDD 2019] Learning a Unified Embedding for Visual Search at Pinterest, paper
    • For every mini-batch, we balance a uniform mix of each of the datasets, with an epoch defined by the number of iterations needed to pass through the largest dataset. Each dataset has its own independent tasks, so we ignore the gradient contributions of images on tasks for which they have no data. The losses from all the tasks are assigned equal weights and summed for backpropagation. (A minimal masking sketch follows this list.)
  • [CVPR 2021] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework, paper, Code
  • [CVPR 2021] Three Birds with One Stone: Multi-Task Temporal Action Detection via Recycling Temporal Annotations, paper
  • [CVPR 2021] Anomaly Detection in Video via Self-Supervised and Multi-Task Learning, paper
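
The "ignore tasks with no data" recipe from the Pinterest entry above can be expressed as a masked, equal-weight loss sum. A minimal sketch (using label -1 as an assumed "no data" convention, not Pinterest's actual code):

```python
import torch
import torch.nn.functional as F

def masked_multitask_loss(task_logits, task_labels):
    """Equal-weight sum of task losses, skipping examples without a label for a task."""
    total = 0.0
    for logits, labels in zip(task_logits, task_labels):
        valid = labels >= 0
        if valid.any():  # contribute gradients only where data exists
            total = total + F.cross_entropy(logits[valid], labels[valid])
    return total

logits = [torch.randn(4, 5), torch.randn(4, 3)]
labels = [torch.tensor([1, -1, 2, 0]), torch.tensor([-1, -1, 1, 2])]
loss = masked_multitask_loss(logits, labels)
```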

Reinforcement learning

Recommendation

Multi-modal

Related Areas

  • Transfer Learning
    • [CVPR 2021] Can We Characterize Tasks Without Labels or Features? paper, code
    • [CVPR 2021] OTCE: A Transferability Metric for Cross-Domain Cross-Task Representations, paper
  • Auxiliary Learning
    • [CVPR 2021] Image Change Captioning by Learning From an Auxiliary Task, paper
  • Multi-label Learning
    • [AAAI 2023] Incomplete Multi-View Multi-Label Learning via Label-Guided Masked View- and Category-Aware Transformers. paper
  • Multi-modal Learning
  • Meta Learning
  • Continual Learning
    • [CVPR 2021] KSM: Fast Multiple Task Adaption via Kernel-wise Soft Mask Learning, paper
    • [ICLR 2021] Linear Mode Connectivity in Multitask and Continual Learning, https://openreview.net/forum?id=Fmg_fQYUejf, Code
    • [arXiv 2023] SubTuning: Efficient Finetuning for Multi-Task Learning. paper
  • Curriculum Learning
  • Ensemble, Distillation and Model Fusion
  • Federated Learning
    • [NeurIPS 2021] Federated Multi-Task Learning under a Mixture of Distributions, paper
    • [arXiv 2022] Multi-Task Distributed Learning using Vision Transformer with Random Patch Permutation. paper
  • Active Learning
    • [arXiv 2022] PartAL: Efficient Partial Active Learning in Multi-Task Visual Settings, paper

Trends

  • Pathways: multi-task, multi-modal, sparsely activated
