xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

Home Page: https://inference.readthedocs.io

[QUESTION] I pulled the Qwen2 7B AWQ 4-bit quantized model, but it gives me gibberish text with no meaning

HakaishinShwet opened this issue · comments

(screenshot: 20240617_13h15m57s_grim)
How can I fix this? Can anyone please guide me? I didn't face the same issue with MiniCPM-Llama3-V 2.5, a transformer-based model I also tested recently on Xinference. I tried multiple times but hit the same problem, so I'm wondering whether the fault is on the Xinference side, in the model config, or something else. I also tried changing various parameters such as temperature and context size, but still saw the same behavior.

From my Ollama experience, I faced a similar issue with some GGUF models, and in those cases the prompt template was the main culprit. Here I didn't change anything related to the template, so I'm wondering whether that could also be the cause. If so, please suggest a template that gives acceptable results.
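For context on the template hypothesis: Qwen2-family chat models expect the ChatML prompt format, and feeding a raw prompt without those markers frequently produces gibberish, an effect that tends to be more visible with quantized weights. Below is a minimal sketch of what that format looks like; the `format_chatml` helper is purely illustrative (it is not an Xinference API), and can be used to eyeball whether the prompt actually reaching the model matches this shape.

```python
def format_chatml(messages):
    """Build a ChatML-style prompt as used by Qwen2-family chat models.

    `messages` is a list of {"role": ..., "content": ...} dicts, as in the
    OpenAI-compatible chat API. Each turn is wrapped in <|im_start|> /
    <|im_end|> markers, and the prompt ends with an open assistant turn
    for the model to complete.
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Example: what the model should actually receive for a one-turn chat.
prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
])
print(prompt)
```

If the prompt hitting the model lacks these markers (or uses a different template), that would be consistent with the GGUF/Ollama cases described above, where the template was the culprit.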