Zhi-guo Huang's repositories
langchain-ChatGLM-annotation
Annotates each module of the langchain-ChatGLM project, adds some new features, and fixes several bugs
cn-llm-codes
A code collection for Chinese LLMs
chat-gpt-langchain-fork
Fork of https://huggingface.co/spaces/JavaFXpert/Chat-GPT-LangChain
debuged-Evolve-GCN
Debugged the Evolve-GCN source code
ds-chat-bloom
ds-chat debugged for the BLOOM model
lit-llama-cn-annotated
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
peft-cn-annotated
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
qlora-zero-cn
QLoRA combined with the ZeRO algorithm to speed up model training and reduce GPU memory requirements; Chinese annotations for each QLoRA module
ZeQLoRA
ZeQLoRA: Efficient Finetuning of Quantized LLMs with ZeRO and LoRA
clash-for-linux-backup
Backup repository for clash-for-linux
cs-224w-cn
Chinese notes for the CS224W course
DeepKE-fork
An open toolkit for knowledge graph extraction and construction, published at EMNLP 2022 System Demonstrations.
Fast-Chatchat
Chinese annotations for FastChat are in the cn_annotation branch; see the README for new features
fastllm-fork
A pure C++ cross-platform LLM acceleration library with Python bindings; ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile devices
GPTCache-dev
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
graphrag-fork
A modular graph-based Retrieval-Augmented Generation (RAG) system
inference-dev
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
Megatron-LM-fork
Ongoing research training transformer models at scale
TensorRT-LLM-dev
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
weaviate-abc
Getting started with Weaviate