Zhi-guo Huang's repositories
langchain-ChatGLM-annotation
Annotates each module of the langchain-ChatGLM project, adds some new features, and fixes several bugs
cn-llm-codes
A code collection for Chinese LLMs
chat-gpt-langchain-fork
Fork of https://huggingface.co/spaces/JavaFXpert/Chat-GPT-LangChain
debuged-Evolve-GCN
Debugged the Evolve-GCN source code
ds-chat-bloom
ds-chat debugged for the BLOOM model
lit-llama-cn-annotated
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
peft-cn-annotated
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
qlora-zero-cn
QLoRA combined with the ZeRO algorithm to speed up model training and reduce GPU memory requirements; Chinese annotations for each QLoRA module
ZeQLoRA
ZeQLoRA: Efficient Finetuning of Quantized LLMs with ZeRO and LoRA
clash-for-linux-backup
Backup repository for clash-for-linux
cs-224w-cn
Chinese notes for the CS224W course
DeepKE-fork
An open toolkit for knowledge graph extraction and construction, published at EMNLP 2022 System Demonstrations.
Fast-Chatchat
Chinese annotations for FastChat are in the cn_annotation branch; see the README for new features
fastllm-fork
A pure C++ cross-platform LLM acceleration library with Python bindings; ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile devices
GPTCache-dev
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
graphrag-fork
A modular graph-based Retrieval-Augmented Generation (RAG) system
inference-dev
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
Megatron-LM-fork
Ongoing research training transformer models at scale
TensorRT-LLM-dev
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
weaviate-abc
Getting started with Weaviate