ChrisGao001's repositories

ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

ChatGLM-MNN

A pure C++ implementation of ChatGLM-6B for easy deployment.

Language: C++ | Stargazers: 0 | Issues: 0

euler

A distributed graph deep learning framework.

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0

fastllm

A pure C++, cross-platform LLM acceleration library with Python bindings; ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile devices.

Language: C++ | Stargazers: 0 | Issues: 0

flash_attention_inference

Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0

FlexGen

Running large language models on a single GPU for throughput-oriented scenarios.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

ggml

Tensor library for machine learning

Language: C | License: MIT | Stargazers: 0 | Issues: 0

gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

graph-learn

An Industrial Graph Neural Network Framework

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

LLM_Notes

LLM_Notes

Stargazers: 0 | Issues: 0

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with a ChatGPT-style Training Pipeline. The training pipeline covers continued pre-training, supervised fine-tuning, reward modeling, and reinforcement learning.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

Language: C++ | Stargazers: 0 | Issues: 0

nann

A flexible, high-performance framework for large-scale retrieval problems based on TensorFlow.

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0

ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 0

Needle

An imperative deep learning framework with customized GPU and CPU backend

Language: Python | Stargazers: 0 | Issues: 0

nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates high-performance executables from a DNN model description.

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0

onnxconverter-common

Common utilities for ONNX converters

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

pdfs

Technically-oriented PDF Collection (Papers, Specs, Decks, Manuals, etc)

Language: HTML | Stargazers: 0 | Issues: 0

ppl.nn

A primitive library for neural networks.

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0

robin-hood-hashing

Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20

Language: C++ | License: MIT | Stargazers: 0 | Issues: 0

text-generation-inference

Large Language Model Text Generation Inference

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0
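
As a rough illustration of how a running text-generation-inference server is typically queried over HTTP, here is a minimal sketch; the host, port, and prompt are assumptions, and a TGI server must already be serving a model.

```python
import requests

# Assumes a text-generation-inference server is already running locally
# (e.g. via the official Docker image) and listening on port 8080.
resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "What is speculative decoding?",  # hypothetical prompt
        "parameters": {"max_new_tokens": 64},
    },
    timeout=60,
)
print(resp.json()["generated_text"])
```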

Torch2TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 0 | Issues: 0

torchrec

PyTorch domain library for recommendation systems

Language: Python | License: BSD-3-Clause | Stargazers: 0 | Issues: 0

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0
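
A minimal sketch of how an FP8-enabled layer from TransformerEngine might be used, assuming a CUDA GPU with FP8 support (Hopper or newer) and the transformer_engine package installed; the layer sizes and input are arbitrary placeholders.

```python
import torch
import transformer_engine.pytorch as te

# One FP8-capable linear layer; sizes are arbitrary placeholders.
layer = te.Linear(1024, 1024, bias=True).cuda()
x = torch.randn(8, 1024, device="cuda")

# Run the forward pass with FP8 autocasting enabled.
with te.fp8_autocast(enabled=True):
    y = layer(x)

print(y.shape)  # torch.Size([8, 1024])
```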

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0
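
For orientation, a minimal sketch of the transformers pipeline API; the task and input text are arbitrary, and the library's default checkpoint is downloaded on first use.

```python
from transformers import pipeline

# Build a sentiment-analysis pipeline with the library's default checkpoint.
classifier = pipeline("sentiment-analysis")

# Arbitrary example input.
print(classifier("Fast inference engines make LLM serving practical."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```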

tvm

Open deep learning compiler stack for CPU, GPU, and specialized accelerators

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0
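
A minimal sketch of the classic TVM Relay compile-and-run flow on CPU; the tiny dense+ReLU graph and its shapes are invented for illustration, and newer TVM releases are migrating this flow toward Relax.

```python
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# A tiny example graph: y = relu(x @ W^T); shapes are arbitrary.
x = relay.var("x", shape=(1, 8), dtype="float32")
w = relay.var("w", shape=(16, 8), dtype="float32")
mod = tvm.IRModule.from_expr(
    relay.Function([x, w], relay.nn.relu(relay.nn.dense(x, w)))
)

# Compile for the local CPU and run through the graph executor.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm")

m = graph_executor.GraphModule(lib["default"](tvm.cpu()))
m.set_input("x", np.random.rand(1, 8).astype("float32"))
m.set_input("w", np.random.rand(16, 8).astype("float32"))
m.run()
print(m.get_output(0).numpy().shape)  # (1, 16)
```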

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0
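
A minimal sketch of vLLM's offline batched-generation API; the model name and prompt are assumptions, and any vLLM-supported checkpoint could be substituted.

```python
from vllm import LLM, SamplingParams

# The model name is a small placeholder checkpoint; swap in any supported model.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)

# Generate completions for a batch of prompts (one here, for brevity).
outputs = llm.generate(["The key idea behind paged attention is"], params)
for out in outputs:
    print(out.outputs[0].text)
```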