NVIDIA Corporation's repositories
Megatron-LM
Ongoing research training transformer models at scale
TensorRT-LLM
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in a performant way.
cuda-python
CUDA Python: Performance meets Productivity
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.
nv-ingest
NeMo Retriever extraction is a scalable, performance-oriented document content and metadata extraction microservice. NeMo Retriever extraction uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts and images that you can use in downstream generative applications.
gpu-operator
NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes
TensorRT-Model-Optimizer
A unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed.
NeMo-Agent-Toolkit
The NVIDIA NeMo Agent toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents.
KAI-Scheduler
KAI Scheduler is an open-source, Kubernetes-native scheduler for AI workloads at large scale
cuda-quantum
C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows
NeMo-Skills
A project to improve the skills of large language models
bionemo-framework
BioNeMo Framework: For building and adapting AI models in drug discovery at scale
JAX-Toolbox
JAX-Toolbox
mig-parted
MIG Partition Editor for NVIDIA GPUs
nim-deploy
A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deployment.
recsys-examples
Examples for recommender systems that are easy to train and deploy on accelerated infrastructure.
vgpu-device-manager
NVIDIA vGPU Device Manager manages NVIDIA vGPU devices on top of Kubernetes
NV-Kernels
Ubuntu kernels which are optimized for NVIDIA server systems
doca-platform
DOCA Platform manages provisioning and service orchestration for BlueField DPUs
spark-rapids-jni
RAPIDS Accelerator JNI For Apache Spark
cloud-native-docs
Documentation repository for NVIDIA Cloud Native Technologies
doca-sosreport
A unified tool for collecting system logs and other debug information