ZhengxiangShi / DePT

[ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning"

Home Page: http://arxiv.org/abs/2309.05173


Running Time

HungerPWAY opened this issue · comments

Hello! I am a newbie in NLP. Thank you very much for the inspiration from your outstanding work! I received a warning that a lower library version might lead to slower running speed. I would like to know roughly how long it takes to train on one dataset, so that I can make a better choice of software versions and hardware. Looking forward to your response!

Hi, thank you so much for your question.

The running time of each dataset/task depends on the size of the training data. For large datasets like MNLI, QQP, and QNLI, which have over 100,000 training examples, it often takes longer to train the model for optimal performance, potentially exceeding 10 hours on a single RTX 3090 GPU. This duration assumes no specific speed-enhancing training techniques are applied. Training times are shorter for datasets with fewer examples. In the few-shot learning setting, the training time is typically not very long, but we need to fine-tune the model on the source task before applying it to the target task, which introduces some additional training time.
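For intuition, the wall-clock figure above can be sanity-checked with a back-of-envelope calculation. The throughput, epoch count, and batch size below are illustrative assumptions, not measurements reported by the authors:

```python
# Back-of-envelope estimate of training wall-clock time.
# All numbers here (epochs, batch size, steps/sec) are illustrative
# assumptions, not measurements from the DePT paper.

def estimate_hours(num_examples: int, epochs: int,
                   batch_size: int, steps_per_second: float) -> float:
    """Rough wall-clock estimate: total optimizer steps / throughput."""
    steps_per_epoch = num_examples // batch_size
    total_steps = steps_per_epoch * epochs
    return total_steps / steps_per_second / 3600

# e.g. an MNLI-sized dataset (~393k training examples), 30 epochs,
# batch size 32, at an assumed 8 optimizer steps/sec on one RTX 3090:
hours = estimate_hours(393_000, 30, 32, 8.0)
print(f"~{hours:.1f} hours")  # on the order of 10+ hours
```

Under these assumed numbers the estimate lands above 10 hours, consistent with the figure quoted above; actual throughput depends heavily on sequence length, prompt length, and library versions.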

In addition, we perform an exhaustive hyperparameter search over two learning rates, which introduces additional training cost. However, inference then benefits from low memory and time costs.
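To illustrate why the two-learning-rate search multiplies training cost, here is a minimal sketch of such a grid search. The value grids and the `train_and_eval` helper are hypothetical placeholders, not the paper's actual search space:

```python
import itertools

# Hypothetical sketch of a grid search over two learning rates:
# one for the soft prompt, one for the low-rank update.
# The grids and train_and_eval() are placeholders, not the paper's setup.

prompt_lrs = [3e-1, 4e-1, 5e-1]
lowrank_lrs = [1e-4, 5e-4, 1e-3]

def train_and_eval(prompt_lr: float, lowrank_lr: float) -> float:
    # Placeholder: train with the given pair and return dev-set accuracy.
    return 0.0

# Every (prompt_lr, lowrank_lr) pair requires a full training run,
# so the grid multiplies total training time by len(grid) runs.
grid = list(itertools.product(prompt_lrs, lowrank_lrs))
best = max(grid, key=lambda pair: train_and_eval(*pair))
print(f"{len(grid)} runs; best (prompt_lr, lowrank_lr): {best}")
```

With a 3 x 3 grid as above, each dataset costs nine training runs instead of one, which is the extra cost mentioned; inference uses only the single best configuration.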

Thank you for your help!