Error calling Chat RTX as localhost
manassm opened this issue · comments
I am reading the docs over at Chat RTX, and it doesn't seem to be compatible with the OpenAI-standard chat completions endpoints.
You might need to use another LLM server like Ollama or vLLM, or find a way to expose OpenAI-style chat endpoints for Chat RTX.
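For reference, Ollama does expose an OpenAI-compatible chat completions endpoint at `http://localhost:11434/v1` by default. A minimal sketch of calling it with only the standard library (the model name `llama3` is just an example; substitute whatever model you have pulled, and adjust the host/port if your server differs):

```python
import json

# Ollama's default OpenAI-compatible endpoint (assumes a local Ollama
# install on the default port; adjust for your setup).
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model, prompt):
    """Build an OpenAI-style chat completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

if __name__ == "__main__":
    from urllib.request import Request, urlopen

    payload = build_chat_request("llama3", "Hello!")
    req = Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Any client that speaks the OpenAI chat protocol (including the official `openai` Python package with `base_url` overridden) should work against this endpoint the same way.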
I am getting a similar error pattern using a local LLM with Ollama, and I also tried LM Studio. I can see that the servers are talking, but there is always an error about something not being defined, whether it is 'maths', 'sin', etc.; there is always an excuse.
I was looking over the Ollama server logs and noticed this:
level=WARN source=server.go:230 msg="multimodal models don't support parallel requests yet"
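If that warning is related, one thing worth trying is disabling parallel request handling when starting the server (`OLLAMA_NUM_PARALLEL` is a documented Ollama environment variable; whether it affects this particular error is an assumption):

```shell
# Restrict Ollama to one in-flight request at a time.
OLLAMA_NUM_PARALLEL=1 ollama serve
```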
What are your prompts?