zhangkaihuo's repositories
Auto-GPT
An experimental open-source attempt to make GPT-4 fully autonomous.
BMTrain
Efficient Training (including pre-training and fine-tuning) for Big Models
CGBN
CGBN: CUDA Accelerated Multiple Precision Arithmetic (Big Num) using Cooperative Groups
CINN
Compiler Infrastructure for Neural Networks
cuda-samples
Samples for CUDA developers that demonstrate features of the CUDA Toolkit
CUDALibrarySamples
CUDA Library Samples
docs
Documentation for PaddlePaddle
flash-attention
Fast and memory-efficient exact attention
glake
GLake: optimizing GPU memory management and IO transmission.
hipSPARSE
ROCm SPARSE marshalling library
libff
C++ library for Finite Fields and Elliptic Curves
llama.cpp
Port of Facebook's LLaMA model in C/C++
llmfarm_core.swift
Swift library to work with llama and other large language models.
mediapipe
Cross-platform, customizable ML solutions for live and streaming media.
MiniCPM
MiniCPM-2B: An end-side LLM that outperforms Llama2-13B.
mlc-MiniCPM
MiniCPM on the Android platform.
open-gpu-kernel-modules
NVIDIA Linux open GPU kernel module source
Paddle3D
A 3D computer vision development toolkit based on PaddlePaddle. It supports point-cloud object detection, segmentation, and monocular 3D object detection models.
PaddleFleetX
Paddle Distributed Training Examples: ResNet, BERT, GPT, MoE, DataParallel, ModelParallel, PipelineParallel, HybridParallel, AutoParallel, Zero, Sharding, Recompute, GradientMerge, Offload, AMP, DGC, LocalSGD, Wide&Deep
PaddleNLP
An NLP library with awesome pre-trained Transformer models and an easy-to-use interface, supporting a wide range of NLP tasks from research to industrial applications.
rapidsnark
Fast zkSNARK prover
second.pytorch
SECOND for KITTI/NuScenes object detection
spconv
Spatial Sparse Convolution Library
sppark
Zero-knowledge template library
TestDatas
test