[Question]: LoRA fine-tuning

Bit-sjw opened this issue a year ago · comments

XXX_Xuxian commented a year ago

Description

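With LoRA fine-tuning, why is the training speed on 2 V100 GPUs (32 GB each) the same as on 4 V100s? I did change the hostfile to 4.
With 2 GPUs, utilization is 100% on every card,
and with 4 GPUs it is also 100% on every card.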

Alternatives

No response

XXX_Xuxian · Answer 1 · Sat Sep 02 2023 18:25:03 GMT+0800 (China Standard Time)

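The only place in the trainer where num_gpus is used is here:

However, when running the LoRA fine-tuning script, execution never enters the if statement shown in the screenshot above, because not_call_launch=True.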

BAAI-OpenPlatform · Answer 2 · Tue Sep 05 2023 10:48:01 GMT+0800 (China Standard Time)

By training speed, do you mean elapsed time per iteration (ms)?
That figure is measured on a single GPU only; it is not a global number.
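
To illustrate the point about per-GPU timing, here is a minimal sketch. It assumes plain data parallelism with a fixed per-GPU batch size, which may or may not match this script's setup. The function name and all numbers are hypothetical and are not taken from FlagAI or from the logs in this issue.

```python
def global_throughput(per_gpu_batch, num_gpus, iter_ms):
    """Samples processed per second across all data-parallel workers.

    iter_ms is the per-iteration time reported by a single GPU.
    """
    return per_gpu_batch * num_gpus * 1000.0 / iter_ms


# Hypothetical numbers: the per-iteration time is identical in both runs.
two = global_throughput(per_gpu_batch=4, num_gpus=2, iter_ms=800.0)
four = global_throughput(per_gpu_batch=4, num_gpus=4, iter_ms=800.0)
print(two, four, four / two)  # 10.0 20.0 2.0
```

Under these assumptions, an unchanged elapsed time per iteration with twice as many GPUs means each iteration covers twice as many samples, so an epoch should finish in about half the wall-clock time.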

MR_U · Answer 3 · Wed Jan 17 2024 15:02:52 GMT+0800 (China Standard Time)

hi @Bit-sjw, have you run into this error?

File "/home/yumengda/.local/lib/python3.9/site-packages/flagai/model/aquila2/aquila2_flash_attn_monkey_patch.py", line 10, in
from flash_attn.bert_padding import pad_input, unpad_input
ModuleNotFoundError: No module named 'flash_attn'

flash_attn does not support the V100. Is there a quick workaround?
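
One generic pattern, offered only as a sketch and not as FlagAI's own mechanism, is to guard the flash_attn code path behind a check that the package is importable and the GPU is new enough. The helper name `can_use_flash_attn` is hypothetical, and the `min_major=8` threshold (Ampere) is an assumption that matches FlashAttention-2. Whether FlagAI offers a switch to skip its monkey patch is not confirmed anywhere in this thread.

```python
import importlib.util


def can_use_flash_attn(compute_capability, min_major=8):
    """Return True only if flash_attn is importable and the GPU is new enough.

    compute_capability is a (major, minor) tuple, for example the value
    returned by torch.cuda.get_device_capability(). min_major=8 is an
    assumption; adjust it for the flash_attn version in use.
    """
    installed = importlib.util.find_spec("flash_attn") is not None
    return installed and compute_capability[0] >= min_major


# The V100 has compute capability 7.0, so the flash_attn path is skipped.
print(can_use_flash_attn((7, 0)))  # False
```

Code that applies a flash-attention patch could call such a check first and fall back to the standard attention implementation when it returns False.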