Chaitanya Sri Krishna Lolla (lcskrishna)

Company: AMD

Location: Santa Clara, California

Home Page: https://lcskrishna.github.io/

Twitter: @chaitan15444482

Chaitanya Sri Krishna Lolla's starred repositories

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 46509 | Issues: 307 | Issues: 659

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python | License: MIT | Stargazers: 36060 | Issues: 367 | Issues: 312

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 25927 | Issues: 222 | Issues: 4267

awesome-robotic-tooling

Tooling for professional robotic development in C++ and Python with a touch of ROS, autonomous driving and aerospace.

pytorchviz

A small package to create visualizations of PyTorch execution graphs

Language: Jupyter Notebook | License: MIT | Stargazers: 3159 | Issues: 31 | Issues: 63

fairscale

PyTorch extensions for high performance and large scale training.

Language: Python | License: NOASSERTION | Stargazers: 3133 | Issues: 45 | Issues: 359

ComputeLibrary

The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.

flops-counter.pytorch

FLOPs counter for convolutional networks in the PyTorch framework

Language: Python | License: MIT | Stargazers: 2766 | Issues: 15 | Issues: 96

neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Language: Python | License: Apache-2.0 | Stargazers: 2134 | Issues: 35 | Issues: 197

Language: Python | License: NOASSERTION | Stargazers: 2032 | Issues: 84 | Issues: 18

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

Language: Python | License: Apache-2.0 | Stargazers: 1780 | Issues: 34 | Issues: 305

intel-extension-for-pytorch

A Python package that extends the official PyTorch to improve performance on Intel platforms

Language: Python | License: Apache-2.0 | Stargazers: 1540 | Issues: 36 | Issues: 517

open-gpu-doc

Documentation of NVIDIA chip/hardware interfaces

Language: C | License: MIT | Stargazers: 1236 | Issues: 97 | Issues: 0

brevitas

Brevitas: neural network quantization in PyTorch

Language: Python | License: NOASSERTION | Stargazers: 1141 | Issues: 34 | Issues: 430

DeepBench

Benchmarking Deep Learning operations on different hardware

Language: C++ | License: Apache-2.0 | Stargazers: 1065 | Issues: 110 | Issues: 71

DNS-Challenge

This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.

Language: Python | License: CC-BY-4.0 | Stargazers: 1054 | Issues: 49 | Issues: 146

resource-stream

CUDA-related news and material links

pytorch_memlab

Profiling and inspecting memory in PyTorch

Language: Python | License: MIT | Stargazers: 1011 | Issues: 13 | Issues: 35

gdrcopy

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

Language: C++ | License: MIT | Stargazers: 845 | Issues: 55 | Issues: 183

kineto

A CPU+GPU profiling library that provides access to timeline traces and hardware performance counters.

Language: HTML | License: NOASSERTION | Stargazers: 684 | Issues: 29 | Issues: 211

oneMKL

oneAPI Math Kernel Library (oneMKL) Interfaces

Language: C++ | License: Apache-2.0 | Stargazers: 605 | Issues: 47 | Issues: 177

multi-gpu-programming-models

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Language: Cuda | License: BSD-3-Clause | Stargazers: 521 | Issues: 27 | Issues: 10

PyProf

A GPU performance profiling tool for PyTorch models

Language: Python | License: Apache-2.0 | Stargazers: 490 | Issues: 20 | Issues: 0

cuda-quantum

C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows

Language: C++ | License: NOASSERTION | Stargazers: 468 | Issues: 22 | Issues: 652

ort

Accelerate PyTorch models with ONNX Runtime

Language: Python | License: MIT | Stargazers: 353 | Issues: 24 | Issues: 37

contiguous_pytorch_params

Accelerate training by storing parameters in one contiguous chunk of memory.

NVTX

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.

Language: C | License: Apache-2.0 | Stargazers: 275 | Issues: 11 | Issues: 35

Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

Language: C++ | License: NOASSERTION | Stargazers: 245 | Issues: 17 | Issues: 610

pytorch-docker-armv7

PyTorch for Raspberry Pi

Language: Dockerfile | License: GPL-3.0 | Stargazers: 4 | Issues: 3 | Issues: 0