bug: Phi-2 support is not exposed via the API Server CLI
ChristianWeyer opened this issue · comments
Summary
Phi-2 prompt template is implemented internally:
https://github.com/second-state/LlamaEdge/blob/6eed9d5b25133e623f643e212c4a672bd2c769e6/api-server/chat-prompts/src/lib.rs#L55
But it is not exposed through the CLI:
https://github.com/second-state/LlamaEdge/blob/6eed9d5b25133e623f643e212c4a672bd2c769e6/api-server/llama-api-server/src/main.rs#L166
Therefore we get an error when trying to run Phi-2:
wasmedge --dir .:. --nn-preload default:GGML:AUTO:phi-2-Q6_K.gguf llama-api-server.wasm --prompt-template phi-2-instruct --model-name phi-2 --socket-addr 127.0.0.1:8080 --log-prompts --log-stat
error: invalid value 'phi-2-instruct' for '--prompt-template <TEMPLATE>'
[possible values: llama-2-chat, codellama-instruct, codellama-super-instruct, mistral-instruct-v0.1, mistral-instruct, mistrallite, openchat, human-assistant, vicuna-1.0-chat, vicuna-1.1-chat, chatml, baichuan-2, wizard-coder, zephyr, stablelm-zephyr, intel-neural, deepseek-chat, deepseek-coder, solar-instruct]
tip: a similar value exists: 'mistral-instruct'
For more information, try '--help'.
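For context, the rejection above comes from the server CLI's fixed list of accepted template names. A minimal sketch of what exposing a template means at the parser level (the enum and variant names here are illustrative assumptions modeled on the error message, not the actual LlamaEdge source, which uses clap):

```rust
use std::str::FromStr;

// Hypothetical, trimmed-down version of a prompt-template type.
// Exposing `phi-2-instruct` via the CLI amounts to adding a variant
// and teaching the parser to accept its string form.
#[derive(Debug, PartialEq)]
enum PromptTemplateType {
    MistralInstruct,
    Phi2Instruct, // the variant missing from the CLI's accepted values
}

impl FromStr for PromptTemplateType {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "mistral-instruct" => Ok(PromptTemplateType::MistralInstruct),
            "phi-2-instruct" => Ok(PromptTemplateType::Phi2Instruct),
            other => Err(format!(
                "invalid value '{other}' for '--prompt-template <TEMPLATE>'"
            )),
        }
    }
}

fn main() {
    // With the variant present, the value parses instead of erroring out.
    assert_eq!(
        "phi-2-instruct".parse::<PromptTemplateType>(),
        Ok(PromptTemplateType::Phi2Instruct)
    );
    // Unknown values still produce the clap-style error message.
    assert!("phi-2-chatty".parse::<PromptTemplateType>().is_err());
    println!("phi-2-instruct parses OK");
}
```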
Reproduction steps
wasmedge --dir .:. --nn-preload default:GGML:AUTO:phi-2-Q6_K.gguf llama-api-server.wasm --prompt-template phi-2-instruct --model-name phi-2 --socket-addr 127.0.0.1:8080 --log-prompts --log-stat
Screenshots
Any logs you want to share for showing the specific issue
No response
Model Information
phi-2-Q6_K.gguf
Operating system information
macOS 14.2.1
ARCH
arm64
CPU Information
Apple M1 Max
Memory Size
64GB
GPU Information
Apple M1 Max
VRAM Size
Apple M1 Max
@ChristianWeyer Thanks for your report. As you found, we implemented the phi-2-chat prompt type and used it to evaluate Phi-2. In our chat experiments with Phi-2, we observed that, unlike typical chat models that have two roles (User and Assistant), Phi-2 uses multiple roles: Alice, Bob, Charlie, ..., and of course User. For now, we cannot support a chat model with multiple roles. This is why we do not expose the phi-2-chat prompt type to our users. Thanks a lot!
Ah, got it, thanks @apepkuss. Basically, I was confused by the release notes then, which state that Phi-2 is supported.
You can use instruct mode with the phi-2-instruct prompt type if you would like to try Phi-2.
As you can see above in my report, this is what I am trying to do... :-)
@ChristianWeyer You have to use llama-chat.wasm if you'd like to run Phi-2 with the phi-2-instruct prompt type. You can refer to the command mentioned in second-state/phi-2-GGUF to run it.
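For reference, the chat-mode invocation would look roughly like the api-server command above, with llama-api-server.wasm swapped for llama-chat.wasm (flags assumed to mirror the api-server's; check the second-state/phi-2-GGUF README for the exact command):

```shell
wasmedge --dir .:. --nn-preload default:GGML:AUTO:phi-2-Q6_K.gguf \
  llama-chat.wasm --prompt-template phi-2-instruct
```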
OK, got that. However, I need an OpenAI API-compatible endpoint for my use cases.
Ok, I see. I'll check whether it can be supported in the api-server. If it works, I'll follow up with you here.
What is the current state of this issue? Thanks!
Has this been implemented in the meantime?
Thanks!