Lu.dev's starred repositories
onnx-tensorrt
ONNX-TensorRT: TensorRT backend for ONNX
text-generation-inference
Large Language Model Text Generation Inference
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
flash-attention
Fast and memory-efficient exact attention
ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
chatglm.cpp
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
whisper.cpp
Port of OpenAI's Whisper model in C/C++
the-algorithm-ml
Source code for Twitter's Recommendation Algorithm
the-algorithm
Source code for Twitter's Recommendation Algorithm
TensorLayer
Deep Learning and Reinforcement Learning Library for Scientists and Engineers
openmlsys-zh
Machine Learning Systems: Design and Implementation (Chinese version)