kvcache-ai / ktransformers

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Home Page: https://kvcache-ai.github.io/ktransformers/

Repository from GitHub: https://github.com/kvcache-ai/ktransformers

[Bug] Missing required parameters of construction of `StaticCache`

YiD11 opened this issue

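Checklist

  • 1. I have searched for related issues but did not get the help I expected
  • 2. The issue has not been fixed in the latest version
  • 3. Please note that if a bug-related issue lacks the corresponding environment information and a minimal reproducible example, it will be difficult for us to reproduce and locate the problem, which reduces the likelihood of receiving feedback
  • 4. If you are raising a question rather than a bug, please start a discussion at https://github.com/kvcache-ai/ktransformers/discussions. Otherwise, the issue will be closed
  • 5. To facilitate community communication, I will use Chinese/English or attach a Chinese/English translation (if using another language). Non-Chinese/English content without a translation may be closed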

Problem Description

In the first round of dialogue, the following error is raised after the input is submitted and before any response is generated:

Exception has occurred: TypeError
Cache.__init__() missing 1 required positional argument: 'layer_classes'
  File "ktransformers/models/custom_cache.py", line 38, in __init__
    Cache.__init__(self)
    ~~~~~~~~~~~~~~^^^^^^
  File "ktransformers/util/utils.py", line 308, in prefill_and_generate
        config = model.config, max_batch_size = 1, max_cache_len = seq_length + max_new_tokens, device = device_map, dtype = model.dtype
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    
    
    )
    ^
  File "ktransformers/local_chat.py", line 191, in local_chat
        model, tokenizer, input_tensor.to(device), max_new_tokens, use_cuda_graph, mode = mode, force_think = force_think, chunk_size = chunk_size,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    
    
    )
    ^
  File "ktransformers/local_chat.py", line 197, in <module>
    fire.Fire(local_chat)
    ~~~~~~~~~^^^^^^^^^^^^
TypeError: Cache.__init__() missing 1 required positional argument: 'layer_classes'
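
The traceback suggests that the `Cache` base class in the installed `transformers` requires a `layer_classes` argument, while `custom_cache.py` calls `Cache.__init__(self)` without it. A minimal sketch for checking which arguments a base class constructor requires is given below. `required_init_params` and `ExampleCache` are hypothetical names introduced only for illustration; `ExampleCache` merely mirrors the signature implied by the error message and is not the real class.

```python
import inspect


def required_init_params(cls):
    """Return the positional __init__ parameters of cls that have no default."""
    signature = inspect.signature(cls.__init__)
    return [
        name
        for name, param in signature.parameters.items()
        if name != "self"
        and param.default is inspect.Parameter.empty
        and param.kind
        in (inspect.Parameter.POSITIONAL_ONLY, inspect.Parameter.POSITIONAL_OR_KEYWORD)
    ]


class ExampleCache:
    # Hypothetical stand-in: mirrors the signature implied by the TypeError above.
    def __init__(self, layer_classes):
        self.layer_classes = layer_classes


if __name__ == "__main__":
    print("stand-in:", required_init_params(ExampleCache))
    try:
        # Inspect the real base class only if transformers is importable.
        from transformers.cache_utils import Cache
    except ImportError:
        print("transformers is not importable in this environment")
    else:
        print("transformers Cache:", required_init_params(Cache))
```

If the list printed for the real `Cache` is non-empty, a subclass that calls `Cache.__init__(self)` with no arguments would be expected to fail in the way shown above.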

Steps to Reproduce

ktransformers/local_chat.py --model_path deepseek-ai/DeepSeek-V2-Lite-Chat --gguf_path my/path/to/DeepSeek-V2-Lite-Chat-GGUF

Environment

pip list

accelerate                1.10.1
aiohappyeyeballs          2.6.1
aiohttp                   3.12.15
aiosignal                 1.4.0
annotated-types           0.7.0
anyio                     4.9.0
argon2-cffi               23.1.0
argon2-cffi-bindings      21.2.0
arrow                     1.3.0
asttokens                 3.0.0
async-lru                 2.0.5
attrs                     25.3.0
babel                     2.17.0
beautifulsoup4            4.13.4
bleach                    6.2.0
blessed                   1.22.0
blobfile                  3.1.0
build                     1.3.0
certifi                   2025.4.26
cffi                      1.17.1
charset-normalizer        3.4.2
click                     8.3.0
colorlog                  6.9.0
comm                      0.2.2
contourpy                 1.3.3
cpufeature                0.2.1
cycler                    0.12.1
datasets                  4.1.1
debugpy                   1.8.14
decorator                 5.2.1
defusedxml                0.7.1
dill                      0.4.0
distro                    1.9.0
executing                 2.2.0
fastapi                   0.116.2
fastjsonschema            2.21.1
filelock                  3.19.1
fire                      0.7.1
flashinfer-python         0.2.3
fonttools                 4.60.0
fqdn                      1.5.1
frozenlist                1.7.0
fsspec                    2025.9.0
greenlet                  3.2.4
h11                       0.16.0
hf-xet                    1.1.10
httpcore                  1.0.9
httpx                     0.28.1
huggingface-hub           0.35.0
idna                      3.10
ipykernel                 6.30.1
ipython                   9.2.0
ipython_pygments_lexers   1.1.1
ipywidgets                8.1.7
isoduration               20.11.0
jedi                      0.19.2
Jinja2                    3.1.6
jiter                     0.11.0
joblib                    1.5.2
json5                     0.12.0
jsonpatch                 1.33
jsonpointer               3.0.0
jsonschema                4.23.0
jsonschema-specifications 2025.4.1
jupyter                   1.1.1
jupyter_client            8.6.3
jupyter-console           6.6.3
jupyter_core              5.7.2
jupyter-events            0.12.0
jupyter-lsp               2.2.5
jupyter_server            2.15.0
jupyter_server_terminals  0.5.3
jupyterlab                4.4.7
jupyterlab_pygments       0.3.0
jupyterlab_server         2.27.3
jupyterlab_widgets        3.0.15
kiwisolver                1.4.9
ktransformers             0.3.2+cu128torch28fancy /home/zhangyf/ktransformers
langchain                 0.3.27
langchain-core            0.3.76
langchain-text-splitters  0.3.11
langsmith                 0.4.29
lxml                      6.0.1
MarkupSafe                3.0.2
matplotlib                3.10.6
matplotlib-inline         0.1.7
mistune                   3.1.3
mpmath                    1.3.0
multidict                 6.6.4
multiprocess              0.70.16
nbclient                  0.10.2
nbconvert                 7.16.6
nbformat                  5.10.4
nest-asyncio              1.6.0
networkx                  3.5
ninja                     1.13.0
notebook                  7.4.2
notebook_shim             0.2.4
numpy                     2.3.3
nvidia-cublas-cu12        12.8.4.1
nvidia-cuda-cupti-cu12    12.8.90
nvidia-cuda-nvrtc-cu12    12.8.93
nvidia-cuda-runtime-cu12  12.8.90
nvidia-cudnn-cu12         9.10.2.21
nvidia-cufft-cu12         11.3.3.83
nvidia-cufile-cu12        1.13.1.3
nvidia-curand-cu12        10.3.9.90
nvidia-cusolver-cu12      11.7.3.90
nvidia-cusparse-cu12      12.5.8.93
nvidia-cusparselt-cu12    0.7.1
nvidia-nccl-cu12          2.27.3
nvidia-nvjitlink-cu12     12.8.93
nvidia-nvtx-cu12          12.8.90
openai                    1.108.1
orjson                    3.11.3
overrides                 7.7.0
packaging                 25.0
pandas                    2.3.2
pandocfilters             1.5.1
parso                     0.8.4
pexpect                   4.9.0
pillow                    11.3.0
pip                       25.1
platformdirs              4.3.7
prometheus_client         0.21.1
prompt_toolkit            3.0.51
propcache                 0.3.2
protobuf                  6.32.1
psutil                    7.0.0
ptyprocess                0.7.0
pure_eval                 0.2.3
pyarrow                   21.0.0
pybind11                  3.0.1
pycparser                 2.22
pycryptodomex             3.23.0
pydantic                  2.11.9
pydantic_core             2.33.2
Pygments                  2.19.1
pyparsing                 3.2.4
pyproject_hooks           1.2.0
python-dateutil           2.9.0.post0
python-json-logger        3.3.0
pytz                      2025.2
PyYAML                    6.0.2
pyzmq                     26.4.0
referencing               0.36.2
regex                     2025.9.18
requests                  2.32.3
requests-toolbelt         1.0.0
rfc3339-validator         0.1.4
rfc3986-validator         0.1.1
rpds-py                   0.24.0
safetensors               0.6.2
scikit-learn              1.7.2
scipy                     1.16.2
Send2Trash                1.8.3
sentencepiece             0.2.1
setuptools                78.1.1
six                       1.17.0
sniffio                   1.3.1
soupsieve                 2.7
SQLAlchemy                2.0.43
stack-data                0.6.3
starlette                 0.48.0
sympy                     1.14.0
tenacity                  9.1.2
termcolor                 3.1.0
terminado                 0.18.1
threadpoolctl             3.6.0
tiktoken                  0.11.0
tinycss2                  1.4.0
tokenizers                0.21.4
torch                     2.8.0
tornado                   6.4.2
tqdm                      4.67.1
traitlets                 5.14.3
transformers              4.54.0
triton                    3.4.0
types-python-dateutil     2.9.0.20241206
typing_extensions         4.13.2
typing-inspection         0.4.1
tzdata                    2025.2
uri-template              1.3.0
urllib3                   2.4.0
uvicorn                   0.36.0
wcwidth                   0.2.13
webcolors                 24.11.1
webencodings              0.5.1
websocket-client          1.8.0
wheel                     0.45.1
widgetsnbextension        4.0.14
xxhash                    3.5.0
yarl                      1.20.1
zmq                       0.0.0
zstandard                 0.25.0

GPU: V100
CUDA Version: 12.9
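
Since the error originates in the interaction between `ktransformers` and `transformers`, a shorter report of only the relevant package versions may be easier to compare across machines than the full `pip list`. The sketch below is one way to collect it. The choice of packages in `KEY_PACKAGES` is an assumption about what matters for this error, and `key_versions` is a hypothetical helper name.

```python
from importlib import metadata

# Assumed set of packages relevant to this error; adjust as needed.
KEY_PACKAGES = ("ktransformers", "transformers", "torch", "flashinfer-python", "triton")


def key_versions(packages=KEY_PACKAGES):
    """Map each package name to its installed version, or None if not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in key_versions().items():
        print(f"{name:<20}{version or 'not installed'}")
```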

the same issue here.

the same issue here.