bentoml / OpenLLM

Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint in the cloud.

bug: Not able to work with locally downloaded model

utsukprani opened this issue · comments

Describe the bug

I am new to OpenLLM.
I am trying to use OpenLLM with a locally downloaded model.

I downloaded the model NousResearch/Llama-2-7b-chat-hf from the Hugging Face website.
Specifically, I downloaded the file 'model-00001-of-00002.safetensors' into a local directory, models, under the project directory.

Then I tried to run the following command:

openllm start opt --model-id ./models/model-00001-of-00002.safetensors

However, when I do this I get the following error:

`(openllm) c:\Self Learning\knowledgeBot>openllm start opt --model-id ./models/model-00001-of-00002.safetensors
Passing 'openllm start opt --model-id ./models/model-00001-of-00002.safetensors' is deprecated and will be remove in a future version. Use 'openllm start ./models/model-00001-of-00002.safetensors' instead.
It is recommended to specify the backend explicitly. Cascading backend might lead to unexpected behaviour.
Traceback (most recent call last):
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\bentoml_internal\", line 109, in from_str
return cls(name, version)
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\bentoml_internal\", line 63, in init
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\bentoml_internal\", line 40, in validate_tag_str
raise ValueError(
ValueError: \self learning\knowledgebot\models\model-00001-of-00002.safetensors is not a valid BentoML tag: a tag's name or version must be at most 63 characters in length, and a tag's name or version must consist of alphanumeric characters, '_', '-', or '.', and must start and end with an alphanumeric character

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "", line 198, in _run_module_as_main
File "", line 88, in run_code
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Scripts\openllm.exe_main
.py", line 7, in
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\click\", line 1157, in call
return self.main(*args, **kwargs)
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\click\", line 1078, in main
rv = self.invoke(ctx)
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\click\", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\click\", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\click\", line 783, in invoke
return __callback(*args, **kwargs)
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\openllm_cli\", line 160, in wrapper
return_value = func(*args, **attrs)
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\click\", line 33, in new_func
return f(get_current_context(), *args, **kwargs)
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\openllm_cli\", line 141, in wrapper
return f(*args, **attrs)
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\openllm_cli\", line 366, in start_command
llm = openllm.LLM(
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\", line 203, in init
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\bentoml_internal\", line 96, in from_taglike
return cls.from_str(taglike)
File "C:\Users\dummy\AppData\Local\anaconda3\envs\openllm\Lib\site-packages\bentoml_internal\", line 111, in from_str
raise BentoMLException(f"Invalid {} {tag_str}")
bentoml.exceptions.BentoMLException: Invalid Tag pt-c:\self learning\knowledgebot\models\model-00001-of-00002.safetensors

(openllm) c:\Self Learning\knowledgeBot>`

I would appreciate it if anyone could look into this and confirm whether it is a bug.
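
The ValueError in the traceback states the tag rule directly: a name or version may be at most 63 characters, may contain only alphanumeric characters, '_', '-', or '.', and must start and end with an alphanumeric character. Below is a minimal Python sketch of that rule as quoted in the message. It is not BentoML's actual implementation; the function name `is_valid_tag_part` and the lowercasing step are assumptions made for illustration (the lowercasing is inferred from the lowercased path in the error text).

```python
import re

# Rule as quoted in the error message: alphanumerics, '_', '-', '.',
# starting and ending with an alphanumeric character.
_TAG_PART = re.compile(r"[a-z0-9](?:[a-z0-9._-]*[a-z0-9])?")
MAX_TAG_PART_LENGTH = 63  # limit stated in the error message


def is_valid_tag_part(part: str) -> bool:
    """Hypothetical check mirroring the rule in the ValueError text."""
    part = part.lower()  # assumption: tags are compared case-insensitively
    if len(part) > MAX_TAG_PART_LENGTH:
        return False
    return _TAG_PART.fullmatch(part) is not None


if __name__ == "__main__":
    # A file path contains backslashes and spaces, so it cannot be a tag.
    path = r"\self learning\knowledgebot\models\model-00001-of-00002.safetensors"
    print(is_valid_tag_part(path))
    print(is_valid_tag_part("my-private-model"))
```

Under this rule, a Windows file path fails on the backslashes and the space alone, which would explain why the path is rejected once it is treated as a tag.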

To reproduce

No response



  • transformers version: 4.37.0
  • Platform: Windows-10-10.0.22621-SP0
  • Python version: 3.11.7
  • Huggingface_hub version: 0.20.3
  • Safetensors version: 0.4.1
  • Accelerate version: 0.26.1
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.1.2+cpu (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

System information (Optional)

No response


I have the same error.

Hi there, model_id is not supposed to be used like this. The file you passed in is a single shard of the model, so we have no way to load it.

Instead, run openllm start /path/to/dir, where the directory contains the model weights and all of the other files required to run the model.
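
As a rough illustration of what "all of the required files" could mean, here is a minimal Python sketch that checks whether a path looks like a complete Hugging Face model directory before it is passed to openllm start. The helper names `check_model_dir` and `download_full_model`, and the specific file checks (config.json, at least one weights file, some tokenizer file), are assumptions based on the usual Transformers layout, not something OpenLLM itself is known to enforce. The download helper relies on `huggingface_hub.snapshot_download`, which fetches every file in a repository rather than a single shard.

```python
from pathlib import Path


def check_model_dir(path) -> list[str]:
    """Return a list of problems; an empty list means the layout looks plausible.

    The checks are assumptions about a typical Transformers checkpoint.
    """
    p = Path(path)
    if not p.is_dir():
        # A single shard file, as in the original command, ends up here.
        return [f"{p} is not a directory"]

    problems = []
    if not (p / "config.json").is_file():
        problems.append("missing config.json")
    if not (any(p.glob("*.safetensors")) or any(p.glob("*.bin"))):
        problems.append("no weight files (*.safetensors or *.bin)")
    if not any(p.glob("tokenizer*")):
        problems.append("no tokenizer files (tokenizer*)")
    return problems


def download_full_model(repo_id: str, local_dir: str) -> str:
    """Download the whole repository, not just one shard (needs network access)."""
    from huggingface_hub import snapshot_download  # imported lazily

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


if __name__ == "__main__":
    for problem in check_model_dir("./models/model-00001-of-00002.safetensors"):
        print(problem)
```

For the model in this issue, something like `download_full_model("NousResearch/Llama-2-7b-chat-hf", "./models/Llama-2-7b-chat-hf")` would produce a directory that can then be checked with `check_model_dir` and passed to openllm start.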

Since 0.5, you should save your private model to BentoML's model store; see the documentation for more information.

Then you can serve it directly with openllm start my-private-model

I will close this due to usage error.