Saravana Periyasamy's starred repositories
papers-we-love
Papers from the computer science community to read and discuss.
skywalking
Application performance monitoring (APM) system
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
rancher-desktop
Container Management and Kubernetes on the Desktop
flashinfer
FlashInfer: Kernel Library for LLM Serving
nxs-universal-chart
A Helm chart for installing any of your applications into Kubernetes/OpenShift
k8s-dra-driver
Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes