johnsmith0031/alpaca_lora_4bit Issues
Stargazers: 533 · Watchers: 12 · Issues: 113 · Forks: 86
Support for moe model? (Updated 4 months ago, 2 comments)
Why is LoRA support limited to a simple LoRA over only q_proj and v_proj? (Updated 5 months ago, 1 comment; see the target-modules sketch after this list)
Trying to get this working with text-generation-webui (Updated 9 months ago, 9 comments)
Error attempting to finetune llama2-70b (Updated 10 months ago, 5 comments)
AttributeError: 'dict' object has no attribute 'to_dict' (Closed a year ago, 1 comment)
Finetuning CodeLLaMA34B - RuntimeError: The size of tensor a (1024) must match the size of tensor b (8192) (Closed a year ago, 3 comments)
docker.io/nvidia/cuda:11.7.0-devel-ubuntu22.04 not available anymore (Closed a year ago, 1 comment)
3 errors detected in the compilation of "src/alpaca_lora_4bit/quant_cuda/quant_cuda_kernel.cu" (Closed a year ago, 2 comments)
monkeypatch problem (Updated a year ago, 8 comments)
ValueError: Target module Autograd4bitQuantLinear() is not supported. (Closed a year ago, 7 comments)
Target module Autograd4bitQuantLinear() is not supported (Updated a year ago, 5 comments)
OOM on inference while I can finetune with more tokens (Closed a year ago, 2 comments)
module 'alpaca_lora_4bit.quant_cuda' has no attribute 'vecquant4recons_v2' (Closed a year ago, 4 comments)
Unable to Build Wheels (Closed a year ago, 8 comments)
Merging LoRA after finetune (Updated a year ago, 1 comment; see the merge sketch after this list)
Targeting all layers and biases (Closed a year ago, 2 comments)
Checkpoint saving broken with the latest version of huggingface (Closed a year ago, 8 comments)
Feature request: Stop when loss reaches X (Updated a year ago, 1 comment; see the callback sketch after this list)
Is alpaca_lora_4bit@winglian-setup_pip missing finetune.py? (Updated a year ago, 1 comment)
High perplexity despite lower loss after LoRA finetuning (how?) (Closed a year ago, 5 comments)
LoRA Output Identical to Base Model (Closed a year ago, 4 comments)
Flash Attention 2 (Closed a year ago, 1 comment)
How to use inference.py after finetune.py? (Closed a year ago, 2 comments)
TypeError: object of type 'NoneType' has no len() (Closed a year ago, 1 comment)
Gibberish results when "faster_mode" is enabled, using the "vicuna-7B-GPTQ-4bit-128g" model (Updated a year ago, 4 comments)
July (Closed a year ago, 4 comments)
Crashes during finetuning (Updated a year ago, 2 comments)
Update docs for >2048-token models (SuperHOT)? (Updated a year ago, 12 comments)
Differences between QLoRA and this repo (Updated a year ago, 2 comments)
Inf or NaN in probabilities. Windows 10, vicuna-7b-gptq-4bit-128g (Closed a year ago, 35 comments)
Does this repo support 2-bit finetuning of the LLaMA model? Is there an example showing how to run the scripts? (Updated a year ago, 1 comment)
[question] Weights in the replaced quantized modules (Closed a year ago)
How to change to 8-bit (Updated a year ago, 1 comment)
Problem with inference (Closed a year ago, 7 comments)
Finetuning with 2 GPUs (Updated a year ago, 2 comments)
Version of GPTQ (Updated a year ago, 2 comments)
How to run inference with a finetuned model? (Updated a year ago, 4 comments)
ImportError: cannot import name '_get_submodules' from 'peft.utils' (Closed a year ago, 10 comments)
Consider using new QLoRA (Updated a year ago, 3 comments)
Implementing Landmark Attention (Updated a year ago)
Finetuning 2-bit Quantized Models (Updated a year ago, 7 comments)
When running with 2 GPUs, got multiple values for keyword argument 'backend' (Closed a year ago, 20 comments)
Code reference request (Updated a year ago, 1 comment)
Problem loading the safetensors file format (Updated a year ago, 3 comments)
What is the difference between the v1 model and the v2 model? (Updated a year ago, 1 comment)
ValueError: Autograd4bitQuantLinear() does not have a parameter or a buffer named qzeros. (Updated a year ago, 2 comments)
Error with monkeypatch, the GPT-J model, and LoRA (Updated a year ago)
TypeError: '<' not supported between instances of 'tuple' and 'float' while trying to generate a completion with the v2 13B LLaMA (Closed a year ago, 6 comments)
Which scripts were used for 4-bit quantization? (Updated a year ago, 2 comments)
run_server.sh: ValueError: Autograd4bitQuantLinear() does not have a parameter or a buffer named g_idx. (Updated a year ago, 1 comment)
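The q_proj/v_proj question above comes up repeatedly: many early finetuning scripts applied LoRA only to the attention query and value projections. As a point of reference, here is a minimal sketch of widening adapter coverage with stock Hugging Face PEFT. The model name is a placeholder, and this goes through PEFT's standard Linear path rather than this repo's patched Autograd4bitQuantLinear modules, so treat it as an illustration of the general technique, not this repo's API.

```python
# Minimal sketch: widening LoRA coverage beyond q_proj/v_proj with stock
# Hugging Face PEFT. Module names follow LLaMA's attention/MLP layout.
# NOTE: this does not use this repo's 4-bit Autograd4bitQuantLinear path.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")  # placeholder base model

config = LoraConfig(
    r=8,
    lora_alpha=16,
    # Many scripts default to just ["q_proj", "v_proj"]; adding the remaining
    # attention and MLP projections trains more parameters per layer.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # shows how much the wider targeting adds
```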
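For the "Merging LoRA after finetune" thread, the standard PEFT route is merge_and_unload(), which folds the trained adapter deltas back into the base weights. A minimal sketch, assuming a full-precision base model (merging into 4-bit quantized weights is precisely what makes this non-trivial in this repo); both paths are placeholders.

```python
# Minimal sketch: folding a trained LoRA adapter into the base weights
# with PEFT's merge_and_unload(). Assumes a full-precision base model.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("path/to/base-model")    # placeholder path
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")      # placeholder path
merged = model.merge_and_unload()  # adds the low-rank deltas into each Linear weight
merged.save_pretrained("path/to/merged-model")                       # placeholder path
```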
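The "stop when loss reaches X" feature request can usually be prototyped without changes to the repo, assuming the finetuning script drives a transformers.Trainer; the callback name and threshold below are hypothetical.

```python
# Hedged sketch: early-stopping on a loss threshold via a TrainerCallback.
# Assumes the finetuning loop is a transformers.Trainer that logs "loss".
from transformers import TrainerCallback

class StopAtLossCallback(TrainerCallback):  # hypothetical helper
    def __init__(self, threshold: float):
        self.threshold = threshold

    def on_log(self, args, state, control, logs=None, **kwargs):
        # Trainer reports the running training loss under the "loss" key.
        if logs and logs.get("loss") is not None and logs["loss"] <= self.threshold:
            control.should_training_stop = True
        return control

# Usage (hypothetical): trainer.add_callback(StopAtLossCallback(threshold=0.5))
```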