Stick Cui's starred repositories
text-generation-webui
A Gradio web UI for Large Language Models.
LLaMA-Factory
A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
ChatGLM2-6B
ChatGLM2-6B: An Open Bilingual Chat LLM
flash-attention
Fast and memory-efficient exact attention
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Baichuan-7B
A large-scale 7B pretrained language model developed by BaiChuan-Inc.
LLMDataHub
A quick guide to datasets, especially trending instruction fine-tuning datasets
long_llama
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
XVERSE-13B
XVERSE-13B: A multilingual large language model developed by XVERSE Technology Inc.
tensorrtllm_backend
The Triton TensorRT-LLM Backend
WanJuan1.0
WanJuan 1.0 multimodal corpus