alibaba / rtp-llm

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.


KeyError: 'MODEL_TYPE'

York-RDWang opened this issue

Following the README, I started a server for meta-llama/Llama-2-7b-hf and got this error:

[root][03/05/2024 09:17:56][gang_server.py:start():181][INFO] world_size==1, do not start gang_server
[root][03/05/2024 09:17:56][util.py:copy_gemm_config():131][INFO] not found gemm_config in HIPPO_APP_INST_ROOT, not copy
[root][03/05/2024 09:17:56][inference_worker.py:__init__():23][INFO] starting InferenceWorker
[root][03/05/2024 09:17:56][start_server.py:local_rank_start():34][ERROR] start server error: 'MODEL_TYPE', trace: Traceback (most recent call last):
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/start_server.py", line 32, in local_rank_start
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/server/inference_app.py", line 37, in start
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/server/inference_server.py", line 57, in start
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/server/inference_worker.py", line 27, in __init__
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/model_factory.py", line 172, in create_from_env
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/model_factory.py", line 85, in create_normal_model_config
  File "/home/ubuntu/anaconda3/envs/rtp_py3.10/lib/python3.10/os.py", line 680, in __getitem__
    raise KeyError(key) from None
KeyError: 'MODEL_TYPE'

Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/rtp_py3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ubuntu/anaconda3/envs/rtp_py3.10/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/start_server.py", line 82, in <module>
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/start_server.py", line 76, in main
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/start_server.py", line 35, in local_rank_start
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/start_server.py", line 32, in local_rank_start
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/server/inference_app.py", line 37, in start
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/server/inference_server.py", line 57, in start
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/server/inference_worker.py", line 27, in __init__
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/model_factory.py", line 172, in create_from_env
  File "<maga_transformer-0.1.5+cuda121>/maga_transformer/model_factory.py", line 85, in create_normal_model_config
  File "/home/ubuntu/anaconda3/envs/rtp_py3.10/lib/python3.10/os.py", line 680, in __getitem__
    raise KeyError(key) from None
KeyError: 'MODEL_TYPE'
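The last frames explain the failure: `os.environ` behaves like a dict and raises `KeyError` for any variable that is not set, and `create_normal_model_config` evidently indexes it directly (an assumption inferred from the traceback, not from the source). A minimal sketch of the failing lookup:

```python
import os

# os.environ raises KeyError for unset variables; this is the exact
# error raised by os.py line 680 in the traceback above.
os.environ.pop('MODEL_TYPE', None)  # ensure the variable is unset
try:
    model_type = os.environ['MODEL_TYPE']
except KeyError as exc:
    print(f'KeyError: {exc}')  # -> KeyError: 'MODEL_TYPE'
```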
commented

You can set it in your command, e.g. by putting MODEL_TYPE=llama in front of it.
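For example, assuming a README-style launch of `maga_transformer.start_server` (the CHECKPOINT_PATH/TOKENIZER_PATH variable names follow the project docs, and the paths are placeholders, not values taken from this issue), here is a Python sketch equivalent to prefixing the shell command with MODEL_TYPE=llama:

```python
import os
import runpy

# Export the model type before the server reads it; equivalent to
# running `MODEL_TYPE=llama python3 -m maga_transformer.start_server`.
os.environ['MODEL_TYPE'] = 'llama'
os.environ['CHECKPOINT_PATH'] = '/path/to/Llama-2-7b-hf'  # placeholder
os.environ['TOKENIZER_PATH'] = '/path/to/Llama-2-7b-hf'   # placeholder

# Run the module the same way `python -m` does (note the runpy frames
# in the traceback above).
runpy.run_module('maga_transformer.start_server', run_name='__main__')
```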

From the documentation:
MODEL_TYPE currently supports: `chatglm`, `chat_glm`, `chatglm2`, `chat_glm_2`, `chatglm3`, `chat_glm_3`, `glm_130b`, `gpt_bigcode`, `wizardcoder`, `sgpt_bloom`, `sgpt_bloom_vector`, `bloom`, `llama`, `xverse`, `llava`, `baichuan`, `gpt_neox`, `qwen_7b`, `qwen_13b`, `qwen_1b8`, `qwen_vl`, `falcon`, `mpt`, `internlm`, `phi`, `aquila`