Lightning-AI / lit-llama

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, and pre-training. Apache 2.0-licensed.

Adapter finetuning does not run on two cards (A100 40G)

wasifferoze opened this issue

I am trying to fine-tune on two cards, but adapter.py hangs after producing the output below. I waited a long time (over an hour) with no progress. What could be the reason?

from jsonargparse.cli import CLI
[2023-10-02 11:27:18,651] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
initializing deepspeed distributed: GLOBAL_RANK: 1, MEMBER: 2/2
Enabling DeepSpeed BF16. Model parameters and inputs will be cast to bfloat16.
[rank: 0] Seed set to 1337
[rank: 1] Seed set to 1338
Number of trainable parameters: 1229760
Number of trainable parameters: 1229760
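
For context, here is a minimal sketch of the multi-GPU setup that finetune/adapter.py is expected to perform, assuming it builds a Lightning Fabric with a DeepSpeed strategy when more than one device is requested (the variable names, the ZeRO stage, and the diagnostic environment variables are illustrative, not taken verbatim from the repo). A hang right after the trainable-parameter count is printed on both ranks often means the ranks stall in inter-GPU communication rather than in model setup, so the comments point at NCCL debugging knobs:

```python
import os
import lightning as L
from lightning.fabric.strategies import DeepSpeedStrategy

# Illustrative value; in lit-llama the device count is a module-level constant
# in finetune/adapter.py that has to be edited by hand before launching.
devices = 2

# NCCL debugging knobs for diagnosing multi-GPU hangs (assumption: the stall is
# in inter-GPU communication, which the log alone cannot confirm or rule out).
os.environ.setdefault("NCCL_DEBUG", "INFO")  # print NCCL setup/collective logs
# os.environ["NCCL_P2P_DISABLE"] = "1"       # try this if peer-to-peer transfers hang


def main() -> None:
    fabric = L.Fabric(
        accelerator="cuda",
        devices=devices,
        # DeepSpeed with bf16 matches the "Enabling DeepSpeed BF16" line in the log.
        strategy=DeepSpeedStrategy(stage=2) if devices > 1 else "auto",
        precision="bf16-true",
    )
    fabric.launch()
    # Per-rank seeds, matching "[rank: 0] Seed set to 1337" / "[rank: 1] Seed set to 1338".
    fabric.seed_everything(1337 + fabric.global_rank)
    # ... model construction, fabric.setup(model, optimizer), and the training loop follow here.


if __name__ == "__main__":
    main()
```

If both ranks print the parameter count and then nothing happens, running once with `NCCL_DEBUG=INFO` usually shows whether the two processes ever complete their first collective; a mismatch in device count, a stale process still holding one GPU, or P2P issues over PCIe are common causes of this kind of silent stall.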