Yifan-Song793 / ETO

Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)

Home Page: https://arxiv.org/abs/2403.02502



Performance with LoRA Finetuning

Yu-Fangxu opened this issue

Hi,
Thanks for your wonderful work! I noticed that you fine-tuned the LLMs on 8 A100 GPUs. Have you ever tried training with LoRA to reduce the computational cost? Thanks~

Hi, Fangxu!
Thanks for the question! Our experiments were conducted in a full-parameter fine-tuning setting. In fact, 4 A100 80G GPUs are enough for our 7B experiments, including both SFT and DPO. To use LoRA in your training, you will need to modify fastchat/train/train.py and fastchat/train/train_dpo.py. You can refer to fastchat/train/train_lora.py for a reference implementation of integrating LoRA with FastChat.