Can this model be inferenced/quantized using llama.cpp?
Adawann opened this issue · comments
ZhouShengJie commented
As the title says.
Alwaysloser commented
error loading model: create_tensor: tensor 'output_norm.weight' not found
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './models/Xwin-LM-70B-V0.1/ggml-model-f32.gguf'
main: error: unable to load model
cannot run inference :(
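For what it's worth, a `create_tensor: tensor 'output_norm.weight' not found` error at load time usually means the GGUF file is incomplete or was produced by an outdated conversion script, so re-converting from the original weights with a current llama.cpp checkout is worth trying first. A minimal sketch of the usual workflow (the model directory and output paths here are assumptions based on the log above):

```shell
# 1. Re-convert the original HF weights to GGUF.
#    f16 is usually preferred over f32 to halve the disk footprint.
python convert.py ./models/Xwin-LM-70B-V0.1 --outtype f16

# 2. Quantize the converted model.
#    q4_K_M is a common size/quality tradeoff for 70B models.
./quantize ./models/Xwin-LM-70B-V0.1/ggml-model-f16.gguf \
           ./models/Xwin-LM-70B-V0.1/ggml-model-q4_K_M.gguf q4_K_M

# 3. Run inference on the quantized file.
./main -m ./models/Xwin-LM-70B-V0.1/ggml-model-q4_K_M.gguf -p "Hello" -n 64
```

If the re-converted file still fails to load, the download of the original weights may itself be incomplete; verifying the checksums of the safetensors/bin shards would be the next step.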
asfandsaleem commented
This seems to be a problem specific to your setup. I tried working with Xwin LMs and they worked great; I just used LM Studio.
Louis commented
@asfandsaleem how did you get it to work in LM Studio? I tried the Vicuna preset without success.