SriKrishna Paparaju (spaparaju)

Company: @RedHatOfficial

SriKrishna Paparaju's repositories

accelerate

🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, and mixed precision

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

Language: C++ · License: Apache-2.0 · Stargazers: 0 · Issues: 0

dbrx

Code examples and resources for DBRX, a large language model developed by Databricks

License: NOASSERTION · Stargazers: 0 · Issues: 0

dcgm-exporter

NVIDIA GPU metrics exporter for Prometheus leveraging DCGM

Language: Go · License: Apache-2.0 · Stargazers: 0 · Issues: 0
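
dcgm-exporter publishes GPU metrics in the Prometheus text exposition format. As a minimal sketch of what one exposed line looks like, here is a tiny formatter for that format — `DCGM_FI_DEV_GPU_UTIL` is a real DCGM field name, but the label set and reading below are made up for illustration, and this helper is not part of the exporter's code:

```python
def prometheus_line(name, labels, value):
    """Format one sample in the Prometheus text exposition format."""
    # Labels are rendered as key="value" pairs, sorted for a stable output.
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{name}{{{label_str}}} {value}"

# Hypothetical GPU-utilization reading for GPU 0 on host "node-a".
line = prometheus_line("DCGM_FI_DEV_GPU_UTIL",
                       {"gpu": "0", "Hostname": "node-a"}, 85)
print(line)  # DCGM_FI_DEV_GPU_UTIL{Hostname="node-a",gpu="0"} 85
```

Prometheus scrapes an HTTP endpoint serving many such lines; the exporter's job is to fill them from DCGM readings.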

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

License: Apache-2.0 · Stargazers: 0 · Issues: 0

DeepSpeedExamples

Example models using DeepSpeed

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

License: Apache-2.0 · Stargazers: 0 · Issues: 0

dspy

DSPy: The framework for programming, not prompting, foundation models

License: MIT · Stargazers: 0 · Issues: 0

faiss

A library for efficient similarity search and clustering of dense vectors.

License: MIT · Stargazers: 0 · Issues: 0
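
The operation faiss accelerates (with index structures, quantization, and SIMD) is nearest-neighbor search over dense vectors. As a hedged illustration of that operation only — the function names below are made up, not the faiss API — here is the brute-force version in pure Python:

```python
def l2_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def search(index_vectors, query, k=2):
    """Return the ids of the k index vectors closest to the query."""
    order = sorted(range(len(index_vectors)),
                   key=lambda i: l2_distance(index_vectors[i], query))
    return order[:k]

vectors = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
print(search(vectors, (0.9, 1.1)))  # [1, 0] — nearest id first
```

faiss replaces this O(n) scan with indexes (e.g. inverted lists, HNSW) that trade a little recall for large speedups at scale.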

gpu-operator

NVIDIA GPU Operator creates, configures, and manages GPUs on Kubernetes

Language: Go · License: Apache-2.0 · Stargazers: 0 · Issues: 0

graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system

License: MIT · Stargazers: 0 · Issues: 0
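
The core idea of graph-based RAG is that retrieval walks an entity graph rather than just matching text. A toy sketch of that retrieval step, assuming a hand-written graph (graphrag's real pipeline extracts the graph from documents with an LLM, and `retrieve_context` is an illustrative name, not its API):

```python
# Entities as nodes, relations as adjacency lists (illustrative data).
graph = {
    "Ada Lovelace": ["Analytical Engine", "Charles Babbage"],
    "Charles Babbage": ["Analytical Engine", "Difference Engine"],
    "Analytical Engine": ["Ada Lovelace", "Charles Babbage"],
}

def retrieve_context(entity, depth=1):
    """Collect entities reachable within `depth` hops of the query entity."""
    frontier, seen = {entity}, {entity}
    for _ in range(depth):
        frontier = {n for node in frontier for n in graph.get(node, [])} - seen
        seen |= frontier
    return sorted(seen - {entity})

print(retrieve_context("Ada Lovelace"))
# ['Analytical Engine', 'Charles Babbage']
```

The retrieved neighborhood is then serialized into the prompt as context for the generation step.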

ignite

High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.

License: BSD-3-Clause · Stargazers: 0 · Issues: 0

jupyterlab-nvdashboard

A JupyterLab extension for displaying dashboards of GPU usage.

Language: TypeScript · License: BSD-3-Clause · Stargazers: 0 · Issues: 0

kuberay

A toolkit to run Ray applications on Kubernetes

Language: Go · License: Apache-2.0 · Stargazers: 0 · Issues: 0

Megatron-LM

Ongoing research training transformer models at scale

License: NOASSERTION · Stargazers: 0 · Issues: 0

mlflow

Open source platform for the machine learning lifecycle

License: Apache-2.0 · Stargazers: 0 · Issues: 0

nim-anywhere

Accelerate your Gen AI with NVIDIA NIM and NVIDIA AI Workbench

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

nim-deploy

A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deployment.

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 0 · Issues: 0

optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy-to-use hardware optimization tools

License: Apache-2.0 · Stargazers: 0 · Issues: 0

ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0
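
Ray's core primitive turns ordinary Python functions into remote tasks whose results are collected later. A rough single-machine analogue of that "submit now, gather later" pattern — using the standard library's `concurrent.futures`, not Ray's API — looks like this:

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # Stand-in for a unit of work Ray would schedule across a cluster.
    return x * x

with ThreadPoolExecutor(max_workers=4) as pool:
    # Submitting returns futures immediately, like ray.remote task handles.
    futures = [pool.submit(square, i) for i in range(5)]
    # Gathering blocks until each result is ready, like ray.get.
    results = [f.result() for f in futures]

print(results)  # [0, 1, 4, 9, 16]
```

Ray generalizes this beyond one process: tasks and actors are scheduled across machines, with a distributed object store holding results.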

streaming

A Data Streaming Library for Efficient Neural Network Training

License: Apache-2.0 · Stargazers: 0 · Issues: 0
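
The idea behind a streaming dataset is that samples are read shard by shard and yielded lazily, so training can begin before the whole dataset is local. A minimal sketch of that access pattern, with in-memory lists standing in for shards (a real streaming loader fetches shards from object storage; `iter_samples` is an illustrative name, not this library's API):

```python
def iter_samples(shards):
    """Yield samples one at a time, touching one shard at a time."""
    for shard in shards:      # in practice: download/open the next shard
        for sample in shard:  # never materializes more than one shard
            yield sample

shards = [[1, 2], [3, 4], [5]]
print(list(iter_samples(shards)))  # [1, 2, 3, 4, 5]
```

Because this is a generator, a training loop can consume samples as they arrive while later shards are still being fetched.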

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ · License: Apache-2.0 · Stargazers: 0 · Issues: 0

TensorRT-Model-Optimizer

TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization and sparsity. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.

Language: Python · License: NOASSERTION · Stargazers: 0 · Issues: 0

torchtune

A Native-PyTorch Library for LLM Fine-tuning

Language: Python · License: BSD-3-Clause · Stargazers: 0 · Issues: 0

transformers

🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch, TensorFlow, and JAX.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

triton

Development repository for the Triton language and compiler

Language: C++ · License: MIT · Stargazers: 0 · Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0