Zhang Cao's repositories
Tom-CaoZH.github.io
This is my homepage.
LLaMA-Factory
Unify Efficient Fine-Tuning of 100+ LLMs
llama.cpp
LLM inference in C/C++
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
TensorRT
NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.
FlexGen
Running large language models on a single GPU for throughput-oriented scenarios.
xalloc
A Rust library for allocating both conventional DRAM-based memory and CXL-based memory.
Ditto
This is the implementation repository of our SOSP'23 paper: Ditto: An Elastic and Adaptive Memory-Disaggregated Caching System.
runc
CLI tool for spawning and running containers according to the OCI specification
memkind
Memkind is an easy-to-use, general-purpose allocator which helps to fully utilize various kinds of memory available in the system, including DRAM, NVDIMM, and HBM
zenfs
ZenFS is a storage backend for RocksDB that enables support for ZNS SSDs and SMR HDDs.
curve
Curve is a high-performance, lightweight-operation, cloud-native open source distributed storage system. Curve can be applied to: 1) mainstream cloud-native infrastructure platforms OpenStack and Kubernetes; 2) high-performance storage for cloud-native databases; 3) cloud storage middleware using S3-compatible object storage as data storage.
Leetcode
My solutions to some LeetCode problems
opendal
OpenDAL: Access data freely, painlessly, and efficiently
XD_EE_DSA_2022
My solutions for XDU EE data structures and algorithms
LearningOS_Record
A record of my daily progress while learning os-comp2022-winter
LevelDBRead
Notes taken while reading the LevelDB source code
RocksDBRead
Notes taken while reading the RocksDB source code
paper_readings
Keeps track of the papers I have read and plan to read
TinyDB
Just a very simple database
mit_6.824
A record of my study of MIT 6.824
rpc_imp
An RPC framework implemented in Go, just for practice
rust_study
My Rust study, based on the CS110L course