xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

Home Page: https://inference.readthedocs.io


BUG: When running inference with the models Qwen-VL-Chat-Int4 and Yi-VL-6B, the Model Engine cannot be found

okwinds opened this issue · comments

First, register the model as shown in the following screenshot.

(screenshot)

Second, launch Qwen-VL-Chat-Int4 from the list of custom models, as shown in the following screenshot.

(screenshot)

Qwen-VL-Chat-Int4

(screenshots)

xinference, version 0.12.1

Additional explanation
Yi-VL-6B has the same issue.

(screenshots)

@okwinds Please provide the full screenshot. Did you select vision for the VL models in the Model Abilities section of the UI?

Yep, I registered it again.

(screenshots)

Do not choose Generate in Abilities.


I tried again; it still doesn't work.

"model_ability": [
    "vision",
    "chat"
],

This error cannot be reproduced in version 0.13.1, please try upgrading to the latest version. @okwinds

I have updated Xinference; it is now version 0.13.2.

Following the same steps, the same problem still exists.

JSON:

{
    "version": 1,
    "context_length": 20000,
    "model_name": "Yi-VL-6B",
    "model_lang": [
        "en",
        "zh"
    ],
    "model_ability": [
        "chat",
        "vision"
    ],
    "model_description": "/home/llm/yi/Yi-VL-6B",
    "model_family": "yi-vl-chat",
    "model_specs": [
        {
            "model_format": "pytorch",
            "model_size_in_billions": 6,
            "quantizations": [
                "none"
            ],
            "model_id": null,
            "model_hub": "huggingface",
            "model_uri": "/home/llm/yi/Yi-VL-6B",
            "model_revision": null
        }
    ],
    "prompt_style": {
        "style_name": "CHATML",
        "system_prompt": "",
        "roles": [
            "<|im_start|>user",
            "<|im_start|>assistant"
        ],
        "intra_message_sep": "<|im_end|>",
        "inter_message_sep": "",
        "stop": [
            "<|endoftext|>",
            "<|im_start|>",
            "<|im_end|>",
            "<|im_sep|>"
        ],
        "stop_token_ids": [
            2,
            6,
            7,
            8
        ]
    },
    "is_builtin": false
}
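Before re-registering, a config like the one above can be sanity-checked offline. This is only a minimal sketch based on this thread's hints (the required-key list and the vision+chat rule are assumptions drawn from the discussion, not Xinference's official schema):

```python
import json

# Keys that the working configs in this thread all contain. This is an
# assumption based on the thread, not the official Xinference schema.
REQUIRED_KEYS = {"version", "model_name", "model_ability",
                 "model_family", "model_specs"}

def check_vl_config(raw: str) -> list:
    """Return a list of problems found in a custom VL-model JSON string."""
    cfg = json.loads(raw)
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - cfg.keys())]
    abilities = set(cfg.get("model_ability", []))
    # Per the maintainers' hint above: do not select "generate" for VL models.
    if "generate" in abilities:
        problems.append("'generate' should not be selected for VL models")
    if not {"vision", "chat"} <= abilities:
        problems.append("VL chat models need both 'vision' and 'chat'")
    return problems
```

For the Yi-VL-6B config above, `check_vl_config` would return an empty list, since it has both abilities and none of the missing-key problems.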

@amumu96 @qinxuye

Watching this issue. I hit the same problem on both 0.11.3 and 0.13.1.

The problem still exists in version 0.13.2. If other models were deployed previously and a config cache exists, configuration works normally; without that cache, no engine can be selected and the dropdown is empty.
(screenshot)

same question!

0.12.0 keeps proving its worth; I solved this by downgrading:
pip install xinference==0.12.0

> 0.12.0 keeps proving its worth; I solved this by downgrading: pip install xinference==0.12.0

Does pip install xinference==0.12.0 install xinference[all] by default? And how do I install xinference[transformers]?
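As far as I know, a plain `pip install xinference==0.12.0` installs only the core package, not the `[all]` extra; backends are requested explicitly, e.g. `pip install "xinference[transformers]==0.12.0"` (the extra names here are taken from Xinference's installation docs, so double-check them for your version). A small sketch to confirm afterwards that the packages an extra is expected to provide are actually importable:

```python
import importlib.util

def backend_available(module_name: str) -> bool:
    """True if the module can be found, without actually importing it."""
    return importlib.util.find_spec(module_name) is not None

# Assumption: the "transformers" extra pulls in at least these packages.
for mod in ("transformers", "torch"):
    print(mod, "available:", backend_available(mod))
```

Running this after installation shows at a glance whether the transformers backend's dependencies made it into the environment.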

This should already be fixed on the main branch; please try it after this week's release. If you are using the Docker image, you can try pulling the nightly-main tag.