Incompatible Matrix Multiplication

ankitpdc opened this issue

@RobinSmits I tried running the model polylm_13b_ft_alpaca_clean_dutch with the same sample data, but I am getting an error about incompatible matrix multiplication.
I want to check the model's performance for the Dutch language. What changes would you suggest?

Hi @ankitpdc
Thanks for reporting the issue. Not exactly sure at this moment what the root cause is.

I did notice that the base model at HuggingFace for PolyLM 13B was modified shortly after I trained and published my adapter model.

I suspect that it is related to this. I will test inference out myself in a few days and see if I get the same error based on the newer base model.

If so, then I will retrain the adapter model sometime next week.

I will keep you updated.

I did some quick tests in a Kaggle Notebook.

The model there works without errors.

I suspect it is related to one of the Python library versions. Please check that your versions of torch, peft, accelerate and bitsandbytes match the ones used in the notebook.
With matching versions it should work.
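A quick way to compare environments is to print the installed versions on both sides. This is only a minimal sketch using the standard library; the helper name `installed_versions` is made up for illustration, and the package list is just the four libraries mentioned above.

```python
from importlib.metadata import PackageNotFoundError, version

# The four libraries mentioned above; extend the list if needed.
PACKAGES = ["torch", "peft", "accelerate", "bitsandbytes"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print one line per package so the output can be pasted into the issue.
for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Running this in both the notebook and the local environment and comparing the output line by line should show any version difference.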

Let me know if it solves the problem.

@RobinSmits Even with the same versions of torch, peft, accelerate and bitsandbytes, I am getting the same error. I am trying to understand the reason for the error. Thank you for updating the Kaggle notebook with the package versions.

Also, please let me know if you find any other reason for the error. I will post here as soon as I find the cause.

Update: I just checked, and in the Colab notebook you tested with open_llama_13b_alpaca_clean_dutch_qlora. That model works perfectly for me as well. I am facing the error with the polylm-13b-inference (polylm_13b_ft_alpaca_clean_dutch) model.

@ankitpdc My mistake ;-)
I've updated the notebook and it is now running for the PolyLM 13B model, so I will see the results tomorrow.

If the adapter model gives the same errors you experienced, then it is likely because the base model was updated.

@ankitpdc The test run in the Kaggle notebook had exactly the same error as you posted.

I will retrain the adapter model sometime in the coming week, as this is related to the updated base model.
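To illustrate why an adapter trained against an older base model can fail this way, here is a rough sketch of the shape rule behind a matrix multiplication error. The dimensions are made up for the example and are not the real PolyLM 13B sizes, the helper `matmul_shape` is hypothetical, and which dimension actually changed in the base model is not established in this thread.

```python
def matmul_shape(a_shape, b_shape):
    """Return the result shape of a 2-D matrix product, or raise ValueError on mismatch."""
    if a_shape[1] != b_shape[0]:
        raise ValueError(f"cannot multiply {a_shape} by {b_shape}")
    return (a_shape[0], b_shape[1])


# Hypothetical sizes, chosen only to keep the example small.
OLD_WIDTH = 8   # layer width the adapter weights were trained against
NEW_WIDTH = 6   # layer width after a (hypothetical) base model update
RANK = 2        # low-rank dimension of the adapter

adapter_weight = (OLD_WIDTH, RANK)

# Activations from the old base model line up with the adapter.
print(matmul_shape((1, OLD_WIDTH), adapter_weight))

# Activations from the changed base model no longer line up.
try:
    matmul_shape((1, NEW_WIDTH), adapter_weight)
except ValueError as err:
    print("shape mismatch:", err)
```

If something like this is the cause, matching library versions cannot fix it, which is consistent with retraining the adapter against the current base model.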

@ankitpdc I've retrained the PolyLM 13B adapter model and pushed it to Hugging Face. The updated training and inference notebooks are also committed to GitHub.
