George Ang's repositories
alertmanager
Prometheus Alertmanager
AudioModem
Transfer data using microphone/speaker on iOS devices
bagua
Bagua Speeds up PyTorch
BMTrain
Efficient Training (including pre-training and fine-tuning) for Big Models
ColossalAI
Colossal-AI: A Unified Deep Learning System for Big Model Era
CommunicationDebugger
Network communication debugger iOS app Swift source code
EpicSurvivalGame
Third-person Survival Game for Unreal Engine 4 (Sample Project)
go-sendcloud
SendCloud mail API client in Go
cutlass
CUDA Templates for Linear Algebra Subroutines
flash-attention
Fast and memory-efficient exact attention
go-tcp-proxy
A small TCP proxy written in Go
ModelCenter
Efficient, Low-Resource, Distributed transformer implementation based on BMTrain
NOTTaskPaperForIOS
Source code for the original TaskPaper for iOS.
pytorch-gpu-benchmark
Using well-known CNN models in PyTorch, we run benchmarks on various GPUs.
socket.IO-objc
socket.io v0.7.2+ for iOS devices
stream-lua-nginx-module
Embed the power of Lua into NGINX TCP/UDP servers
supervisor
Supervisor process control system for UNIX
tensorflow
An Open Source Machine Learning Framework for Everyone
TensorFlow-Examples
TensorFlow Tutorial and Examples for Beginners with Latest APIs
TLLM_QMM
TLLM_QMM strips out the quantized kernel implementations from Nvidia's TensorRT-LLM, removes the NVInfer dependency, and exposes an easy-to-use PyTorch module. We modified the dequantization and weight preprocessing to align with popular quantization algorithms such as AWQ and GPTQ, and combined them with new FP8 quantization.
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
ZhiLight
A highly optimized LLM inference acceleration engine for Llama and its variants.