Qizhen WENG's repositories
awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
clusterdata
cluster data collected from production clusters in Alibaba for cluster management research
ColossalAI
Making large AI models cheaper, faster and more accessible
credentials-nodejs
Alibaba Cloud Credentials for TypeScript/Node.js
FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.
hkust-latex-thesis-template
A Better HKUST LaTeX Thesis Template
horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
open-simulator
K8s cluster simulator for capacity planning
qzweng.github.io
My academic personal pages
skypilot
SkyPilot is a framework for easily running machine learning workloads on any cloud through a unified interface.
DeepPlan
Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access (ACM EuroSys '23)
FlexGen
Running large language models on a single GPU for throughput-oriented scenarios.
HeliosArtifact
incubator-mxnet
Lightweight, portable, flexible distributed/mobile deep learning with a dynamic, mutation-aware dataflow dependency scheduler; for Python, R, Julia, Scala, Go, JavaScript, and more
k8s-device-plugin
The OpenAIOS vGPU device plugin for Kubernetes originated from the OpenAIOS project to virtualize GPU device memory, allowing applications to access more memory than the physical capacity. It is designed to make extended device memory easy to use for AI workloads.
k8s-vgpu-scheduler
The OpenAIOS vGPU scheduler for Kubernetes originated from the OpenAIOS project to virtualize GPU device memory.
obsidian-things
An Obsidian theme inspired by the beautifully-designed app, Things.
ps-lite
A lightweight parameter server interface
qzweng.github.io-202308
My academic personal pages
ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a toolkit of libraries (Ray AIR) for accelerating ML workloads.
seed_rl
SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture.
typeset
Automatically fixes full-width/half-width punctuation, spacing, and similar typesetting issues in mixed Chinese, English, and code text
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
xtuner
An efficient, flexible and full-featured toolkit for fine-tuning large models (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)