mzbac / mlx-llm-server

For running inference and serving local LLMs using the MLX framework

issue about safetensors missing

vtboyarc opened this issue · comments

commented

I am using this model: https://huggingface.co/mlx-community/Llama-2-7b-chat-mlx

I have downloaded the config, tokenizer, and weights, stored them in a folder on my machine, and pointed --model-path at that location. But I get the error below about no safetensors being found:

ERROR:root:No safetensors found in mlx-community:Llama-2-7b-chat-mlx
Traceback (most recent call last):
  File "/opt/homebrew/bin/mlx-llm-server", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mlx_llm_server/__main__.py", line 21, in main
    app = create_app(args.model_path, args.adapter_file)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mlx_llm_server/app.py", line 128, in create_app
    model, tokenizer = load(model_path, adapter_file=adapter_file)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mlx_lm/utils.py", line 279, in load
    model = load_model(model_path)
            ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/mlx_lm/utils.py", line 216, in load_model
    raise FileNotFoundError(f"No safetensors found in {model_path}")
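
Judging from the traceback, load_model raises this error whenever it finds no *.safetensors files in the model directory. A quick sketch to confirm what the loader actually sees (the local path here is hypothetical; substitute the folder you pass to --model-path):

from pathlib import Path

# Hypothetical location; replace with the folder passed to --model-path.
model_dir = Path("~/models/Llama-2-7b-chat-mlx").expanduser()

# List everything in the folder, then check for the files the loader wants.
print(sorted(p.name for p in model_dir.iterdir()))
print("safetensors present:", any(model_dir.glob("*.safetensors")))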

I can't find any MLX models on HF that have a safetensors file anyway. I tried a few other models just to rule this particular one out.

That model is in the old MLX format, which is no longer supported by mlx-lm. mlx-lm can now load Hugging Face models directly. You can try https://huggingface.co/NousResearch/Llama-2-7b-chat-hf:

mlx-llm-server --model-path NousResearch/Llama-2-7b-chat-hf
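
You can also load that repo id directly from Python with mlx-lm; a minimal sketch (the prompt is just an example):

from mlx_lm import load, generate

# First use downloads the weights from the Hugging Face Hub and caches them.
model, tokenizer = load("NousResearch/Llama-2-7b-chat-hf")

# Example prompt, just to confirm the model loads and generates.
print(generate(model, tokenizer, prompt="Hello", max_tokens=32))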
commented

Ahh, good to know. That one works; I had figured any of the MLX community models would work. Thank you!