MiniCPM-Llama3-V 2.5 int4 版本支持微调吗?
myBigbug opened this issue · comments
是否已有关于该错误的issue或讨论? | Is there an existing issue / discussion for this?
- 我已经搜索过已有的issues和讨论 | I have searched the existing issues / discussions
该问题是否在FAQ中有解答? | Is there an existing answer for this in FAQ?
- 我已经搜索过FAQ | I have searched FAQ
当前行为 | Current Behavior
因为MiniCPM-Llama3-V 2.5 支持微调,但是显卡内存只有24GB,不够使用,所以MiniCPM-Llama3-V 2.5 int4支持微调吗?
目前我微调会得到报错ValueError: You cannot perform fine-tuning on purely quantized models. Please attach trainable adapters on top of the quantized model to correctly perform fine-tuning. Please see: https://huggingface.co/docs/transformers/peft for more details
期望行为 | Expected Behavior
No response
复现方法 | Steps To Reproduce
No response
运行环境 | Environment
- OS:Centos7
- Python: 3.10
- Transformers:4.40.0
- PyTorch:2.1.2
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):12.1
pip包
accelerate 0.30.1
addict 2.4.0
aiofiles 23.2.1
altair 5.3.0
annotated-types 0.7.0
anyio 4.4.0
attrs 23.2.0
bitsandbytes 0.42.0
blis 0.7.11
catalogue 2.0.10
certifi 2024.6.2
charset-normalizer 3.3.2
click 8.1.7
cloudpathlib 0.16.0
colorama 0.4.6
confection 0.1.5
contourpy 1.2.1
cycler 0.12.1
cymem 2.0.8
editdistance 0.6.2
einops 0.7.0
et-xmlfile 1.1.0
exceptiongroup 1.2.1
fairscale 0.4.0
fastapi 0.110.3
ffmpy 0.3.2
filelock 3.14.0
fonttools 4.53.0
fsspec 2024.6.0
gradio 4.26.0
gradio_client 0.15.1
h11 0.14.0
httpcore 1.0.5
httpx 0.27.0
huggingface-hub 0.23.3
idna 3.7
importlib_resources 6.4.0
Jinja2 3.1.4
joblib 1.4.2
jsonlines 4.0.0
jsonschema 4.22.0
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
langcodes 3.4.0
language_data 1.2.0
lxml 5.2.2
marisa-trie 1.2.0
markdown-it-py 3.0.0
markdown2 2.4.10
MarkupSafe 2.1.5
matplotlib 3.7.4
mdurl 0.1.2
more-itertools 10.1.0
mpmath 1.3.0
murmurhash 1.0.10
networkx 3.3
nltk 3.8.1
numpy 1.24.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.18.1
nvidia-nvjitlink-cu12 12.5.40
nvidia-nvtx-cu12 12.1.105
opencv-python-headless 4.5.5.64
openpyxl 3.1.2
orjson 3.10.4
packaging 23.2
pandas 2.2.2
Pillow 10.1.0
pip 24.0
portalocker 2.8.2
preshed 3.0.9
protobuf 4.25.0
psutil 5.9.8
pydantic 2.7.3
pydantic_core 2.18.4
pydub 0.25.1
Pygments 2.18.0
pyparsing 3.1.2
python-dateutil 2.9.0.post0
python-multipart 0.0.9
pytz 2024.1
PyYAML 6.0.1
referencing 0.35.1
regex 2024.5.15
requests 2.32.3
rich 13.7.1
rpds-py 0.18.1
ruff 0.4.8
sacrebleu 2.3.2
safetensors 0.4.3
scipy 1.13.1
seaborn 0.13.0
semantic-version 2.10.0
sentencepiece 0.1.99
setuptools 70.0.0
shellingham 1.5.4
shortuuid 1.0.11
six 1.16.0
smart-open 6.4.0
sniffio 1.3.1
socksio 1.0.0
spacy 3.7.2
spacy-legacy 3.0.12
spacy-loggers 1.0.5
srsly 2.4.8
starlette 0.37.2
sympy 1.12.1
tabulate 0.9.0
thinc 8.2.4
timm 0.9.10
tokenizers 0.19.1
tomlkit 0.12.0
toolz 0.12.1
torch 2.1.2
torchvision 0.16.2
tqdm 4.66.1
transformers 4.40.0
triton 2.1.0
typer 0.9.4
typing_extensions 4.8.0
tzdata 2024.1
urllib3 2.2.1
uvicorn 0.24.0.post1
wasabi 1.1.3
weasel 0.3.4
websockets 11.0.3
wheel 0.43.0
备注 | Anything else?
No response
Another question, can you guys (i mean authors) share the quantize scripts? we need the script after sft this model.
Another question, can you guys (i mean authors) share the quantize scripts? we need the script after sft this model.
Another question, can you guys (i mean authors) share the quantize scripts? we need the script after sft this model.
Another question, can you guys (i mean authors) share the quantize scripts? we need the script after sft this model.
我也想知道是如何量化的,请问你得到了吗
Another question, can you guys (i mean authors) share the quantize scripts? we need the script after sft this model.
Another question, can you guys (i mean authors) share the quantize scripts? we need the script after sft this model.
Another question, can you guys (i mean authors) share the quantize scripts? we need the script after sft this model.
我也想知道是如何量化的,请问你得到了吗
No, I'm still waiting
可以啊
在finetune_.lora.sh
改成
MODEL="openbmb/MiniCPM-Llama3-V-2_5-int4"
--tune_vision false
--deepspeed ds_config_zero3.json
就可以了
可以啊
在finetune_.lora.sh 改成 MODEL="openbmb/MiniCPM-Llama3-V-2_5-int4"
--tune_vision false --deepspeed ds_config_zero3.json
就可以了
@nickyisadog
我是使用finetune_ds.sh脚本不是lora脚本微调int-4模型得到了报错,ValueError: You cannot perform fine-tuning on purely quantized models. Please attach trainable adapters on top of the quantized model to correctly perform fine-tuning. Please see: https://huggingface.co/docs/transformers/peft for more details 。 请帮忙分析这是什么原因导致的?
I am facing this error, RuntimeError: Expected is_sm80 || is_sm90 to be true, but got false. (Could this error message be improved? If so, please report an enhancement request to PyTorch.)
I ran with these changes:
In finetune_.lora.sh,
change
MODEL="openbmb/MiniCPM-Llama3-V-2_5-int4"
--tune_vision false
--deepspeed ds_config_zero3.json