vLLM GROQ issue

Question

chrissas opened this issue 2 months ago · comments

The Groq URL given in the documentation (https://github.com/h2oai/h2ogpt/blob/main/docs/FAQ.md) is wrong; the Groq team confirmed this to me.

https://api.groq.com/openai:None:/v1: is deprecated and unused.

I tried their URL, but I still get this error:
Exception: Error code: 404 - {'error': {'message': 'Unknown request URL: POST /openai/v1/completions:/completions. Please check the URL for typos, or see the docs at https://console.groq.com/docs/', 'type': 'invalid_request_error', 'code': 'unknown_url'}}

Command:
python generate.py --model_lock="[{'inference_server':'vllm:https://api.groq.com/openai/v1/chat/completions:', 'base_model':'mixtral-8x7b-32768', 'max_seq_len': 31744, 'prompt_type':'plain'}]"

Command:
python generate.py --model_lock="[{'inference_server':'vllm:https://api.groq.com/openai:None:/v1/chat:GroqAPIkey', 'base_model':'mixtral-8x7b-32768', 'max_seq_len': 31744, 'prompt_type':'plain'}]"
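The path in the 404 above (`/openai/v1/completions:/completions`) looks like an endpoint name appended to a base URL that already ended in an endpoint plus a stray colon. The following is a minimal sketch of that effect, not h2oGPT's actual parsing code: `join_endpoint` is a hypothetical helper, and it assumes an OpenAI-style client that appends the endpoint path to whatever base URL it is given. It also assumes the base URL should stop at `/v1`.

```python
def join_endpoint(base_url: str, endpoint: str) -> str:
    """Hypothetical helper: append an endpoint path to a base URL,
    roughly the way OpenAI-style clients build request URLs."""
    return base_url.rstrip("/") + "/" + endpoint.lstrip("/")


# Base URL that already contains an endpoint and a trailing colon:
# the client still appends its own endpoint, giving a doubled path.
bad = join_endpoint("https://api.groq.com/openai/v1/completions:", "completions")

# Base URL that stops at the API version (assumed correct form).
good = join_endpoint("https://api.groq.com/openai/v1", "chat/completions")

print(bad)
print(good)
```

If this assumption holds, the first string reproduces the path reported in the error, which suggests the colon-separated `inference_server` value is being split or passed through in a way that leaves the endpoint and the colon inside the base URL.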

PSEUDOTENSOR / Jonathan McKinney · Answer 1 · Mon Apr 15 2024 10:23:22 GMT+0800 (China Standard Time)

Use the other way mentioned for groq a bit lower in the FAQ, i.e.