NVIDIA Corporation's repositories
Megatron-LM
Ongoing research training transformer models at scale
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
nvidia-container-toolkit
Build and run containers leveraging NVIDIA GPUs
GenerativeAIExamples
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
gpu-operator
NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes clusters
NeMo-Aligner
Scalable toolkit for efficient model alignment
cuda-quantum
C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows
NeMo-Curator
Scalable data pre-processing and curation toolkit for LLMs
NeMo-text-processing
NeMo text processing for ASR and TTS
JAX-Toolbox
CI and container tooling for running JAX on NVIDIA GPUs
metropolis-nim-workflows
Collection of reference workflows for building intelligent agents with NIMs
TensorRT-Incubator
Experimental projects related to TensorRT
spark-rapids-ml
Spark RAPIDS MLlib – accelerate Apache Spark MLlib with GPUs
spark-rapids-jni
RAPIDS Accelerator JNI For Apache Spark
k8s-nim-operator
An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment.
cloud-native-docs
Documentation repository for NVIDIA Cloud Native Technologies
PLDM-unpack
Tool to unpack or parse PLDM (Platform Level Data Model v1.0.1) firmware update files.