mallorbc / Finetune_LLMs

Repo for fine-tuning Causal LLMs

mallorbc/Finetune_LLMs Issues

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 394.00 MiB
Updated 7 months ago · 3
Unable to find image 'gpt:latest' locally
Closed 7 months ago · 1
gradient overflow when training 13b Llama Model on 7 a100s
Updated 10 months ago · 1
Can't find a valid checkpoint
Closed 10 months ago · 1
Sends Kill to process when trying to resume a finetune on LLaMA 7B
Closed 10 months ago · 2
RuntimeError: Error building extension 'cpu_adam'
Closed 10 months ago · 5
RuntimeError: The expanded size of the tensor (50257) must match the existing size (0) at non-singleton dimension 0. Target sizes: [50257]. Tensor sizes: [0]
Closed 10 months ago · 2
RuntimeError: Error building extension 'cpu_adam' / AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
Closed 10 months ago · 1
"nvcc fatal : Unsupported gpu architechture 'compute_89'" with docker image
Closed a year ago · 3
cannot import name 'GPTNeoXForCausalLM' from 'transformers'
Closed a year ago · 1
Running super slow on 4 a100 gpus
Closed a year ago · 2
How to make the inference of GPT-J run on multiple GPU ?
Closed 2 years ago · 2
DeepSpeedZeRoOffload initialize [end]
Closed 2 years ago · 4
[QUESTION] single_texts vs group_texts
Closed 2 years ago · 2
File: Dockerfile Line:32
Closed 2 years ago · 1
Incorrect block size?
Closed 2 years ago · 3
Training data format for generating Scenario based MCQ's
Closed 2 years ago · 2
Can't perform example_run, getting an error after deepspeed is initialized
Closed 3 years ago · 2
deepspeed>=0.5.7 is required by recent versions of the transformers package
Closed 3 years ago · 3
Error while running convert_model_to_torch script
Closed 3 years ago · 3