Zhang Cao's repositories
curve
Curve is a high-performance, lightweight-operation, cloud-native open source distributed storage system. Curve can be applied to: 1) mainstream cloud-native infrastructure platforms such as OpenStack and Kubernetes; 2) high-performance storage for cloud-native databases; 3) cloud storage middleware using S3-compatible object storage as the data store.
FlexGen
Running large language models on a single GPU for throughput-oriented scenarios.
paper_readings
Keeps track of the papers I have read and those still to be read
Ditto
This is the implementation repository of our SOSP'23 paper: Ditto: An Elastic and Adaptive Memory-Disaggregated Caching System.
LearningOS_Record
Records my daily progress while learning os-comp2022-winter
LevelDBRead
Notes taken while reading the LevelDB source code
LLaMA-Factory
Unify Efficient Fine-Tuning of 100+ LLMs
llama.cpp
LLM inference in C/C++
memkind
Memkind is an easy-to-use, general-purpose allocator which helps to fully utilize various kinds of memory available in the system, including DRAM, NVDIMM, and HBM
opendal
OpenDAL: Access data freely, painlessly, and efficiently
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
RocksDBRead
Notes taken while reading the RocksDB source code
runc
CLI tool for spawning and running containers according to the OCI specification
rust_study
My Rust study, based on the CS 110L course
TensorRT
NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applications.
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Tom-CaoZH.github.io
This is my homepage.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
xalloc
A Rust library for allocating both conventional DRAM-based memory and CXL-based memory
XD_EE_DSA_2022
My solutions to the XDU EE data structures and algorithms course
zenfs
ZenFS is a storage backend for RocksDB that enables support for ZNS SSDs and SMR HDDs.