[Feature] Support old MPT GGUF conversions with duplicated output tensor

Question

dlippold opened this issue a month ago

The fine-tuned MPT model from https://huggingface.co/maddes8cht/mosaicml-mpt-7b-instruct-gguf/ in Q4_1 quantization was usable in release 2.7.2 but is no longer usable in 2.7.3 and later, including the current release.

When I try to load the model file I get the following error message:

Could not load model due to invalid model file for mosaicml-mpt-7b-instruct-Q4_1.gguf

The cause of the problem may be related to #2006.

Expected behavior: the model file should load, as it did in 2.7.2.

Jared Van Bortel · Answer 1 · Fri May 10 2024 06:35:06 GMT+0800 (China Standard Time)

I fixed this upstream in ggerganov/llama.cpp#6139, which should make it into the next release of GPT4All (already included in #2310).
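
For anyone who sees the same "invalid model file" error before a release with the fix is available, a header check can help tell a corrupted download apart from a loader incompatibility like this one. The following is a minimal sketch and not part of GPT4All or llama.cpp: `read_gguf_header` is a hypothetical helper, and it assumes a little-endian GGUF file of version 2 or later (4-byte magic `GGUF`, a uint32 version, then uint64 tensor and metadata key/value counts).

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF header.

    Assumption: little-endian GGUF v2+ layout, where both counts are uint64.
    Raises ValueError if the magic is wrong or the header is truncated.
    """
    with open(path, "rb") as f:
        header = f.read(24)  # 4 (magic) + 4 (version) + 8 + 8 (counts)
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file (bad magic or truncated header)")
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return version, tensor_count, kv_count


# Example call (the file name is taken from the error message above):
# print(read_gguf_header("mosaicml-mpt-7b-instruct-Q4_1.gguf"))
```

If this returns plausible values, the file is at least structurally a GGUF file, which suggests the error comes from the loader rejecting the tensor layout rather than from a damaged download. It does not inspect the tensors themselves.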