There are 247 repositories under the cuda topic.
A high-throughput and memory-efficient inference and serving engine for LLMs
SGLang is a fast serving framework for large language models and vision language models.
Build and run Docker containers leveraging NVIDIA GPUs
Instant neural graphics primitives: lightning fast NeRF and more
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
A fast, scalable, high-performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Modular ZK (Zero Knowledge) backend accelerated by GPU
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
Go package for computer vision using OpenCV 4 and beyond. Includes support for DNN, CUDA, OpenCV Contrib, and OpenVINO.
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
A PyTorch Library for Accelerating 3D Deep Learning Research
Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.
Fast inference engine for Transformer models
Lightning fast C++/CUDA neural network framework
📚Modern CUDA Learn Notes: 200+ Tensor/CUDA Cores Kernels🎉, HGEMM, FA2 via MMA and CuTe, 98~100% TFLOPS of cuBLAS/FA2.
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.