intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.

Please add support for the GLM-4-9B model

KiwiHana opened this issue 21 days ago

KiwiHana commented 21 days ago

https://github.com/THUDM/GLM-4
https://huggingface.co/THUDM/glm-4-9b