Chaitanya Sri Krishna Lolla's repositories
nccl-rccl-parser
Tool to run rccl-tests/nccl-tests based on an application and gather performance data.
pt-aten-logger
PyTorch op instrumentor
aiter
AI Tensor Engine for ROCm
ao
PyTorch native quantization and sparsity for training and inference
ClassyVision
An end-to-end PyTorch framework for image and video classification
cupy
A NumPy-compatible array library accelerated by CUDA
DeepLearningExamples
Deep Learning Examples
fmwork
FM Benchmarking Framework
ieee-sps-ml-tutorial
IEEE SPS Machine Learning Training tutorial
Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
my-pytorch-experiments
Contains some of my PyTorch experiments and practice
sglang
SGLang is a fast serving framework for large language models and vision language models.
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
training_results_v0.7
Training v0.7 results
transformers
🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.
vision
Datasets, Transforms and Models specific to Computer Vision
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
VMZ
VMZ: Model Zoo for Video Modeling