Saravana Periyasamy (sara4dev)

Company: NVIDIA

Location: Dallas, TX

Twitter: @sara4dev

Organizations
meygam

Saravana Periyasamy's starred repositories

langchain

🦜🔗 Build context-aware reasoning applications

Language: Jupyter Notebook · License: MIT · Stargazers: 90804 · Issues: 679 · Issues: 7409

papers-we-love

Papers from the computer science community to read and discuss.

llama.cpp

LLM inference in C/C++

autogen

A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap

Language: Jupyter Notebook · License: CC-BY-4.0 · Stargazers: 29714 · Issues: 361 · Issues: 1560
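
A minimal sketch of autogen's two-agent pattern (AssistantAgent plus UserProxyAgent); the model name and llm_config shape are placeholders, not taken from this page.

```python
# Sketch of a pyautogen two-agent chat; config values are illustrative assumptions.
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"config_list": [{"model": "gpt-4", "api_key": "YOUR_OPENAI_API_KEY"}]}

assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",        # run fully automated
    code_execution_config=False,     # no local code execution for this sketch
)

# The user proxy drives the conversation; the assistant replies until termination.
user_proxy.initiate_chat(assistant, message="Summarize what a vector database is in two sentences.")
```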

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 24924 · Issues: 219 · Issues: 4003
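
A short sketch of vLLM's offline batch-inference API; the model name is an illustrative placeholder rather than anything referenced on this page.

```python
# Offline generation with vLLM's Python API (model name is a placeholder).
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")                      # weights pulled from Hugging Face
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The key idea behind paged attention is"], params)
for out in outputs:
    print(out.outputs[0].text)
```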

skywalking

APM, Application Performance Monitoring System

Language: Java · License: Apache-2.0 · Stargazers: 23612 · Issues: 837 · Issues: 5280

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda · License: MIT · Stargazers: 22669 · Issues: 222 · Issues: 129

qdrant

Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Language: Rust · License: Apache-2.0 · Stargazers: 19481 · Issues: 119 · Issues: 1181
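
A minimal sketch of talking to a local Qdrant instance with the qdrant-client Python package; the collection name, vector size, and payload are made-up examples.

```python
# Create a collection, upsert one point, and run a similarity search (all values illustrative).
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

client = QdrantClient(url="http://localhost:6333")

client.recreate_collection(
    collection_name="demo",
    vectors_config=VectorParams(size=4, distance=Distance.COSINE),
)
client.upsert(
    collection_name="demo",
    points=[PointStruct(id=1, vector=[0.1, 0.2, 0.3, 0.4], payload={"doc": "hello"})],
)
hits = client.search(collection_name="demo", query_vector=[0.1, 0.2, 0.3, 0.4], limit=1)
print(hits[0].payload)
```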

candle

Minimalist ML framework for Rust

Language: Rust · License: Apache-2.0 · Stargazers: 14881 · Issues: 147 · Issues: 656

triton

Development repository for the Triton language and compiler

ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models

Language: Python · License: Apache-2.0 · Stargazers: 11050 · Issues: 193 · Issues: 1062
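
A small sketch of Ludwig's declarative, low-code API; the config dict and CSV path are invented for illustration and not part of this page.

```python
# Declarative text-classification example with Ludwig (dataset and feature names assumed).
from ludwig.api import LudwigModel

config = {
    "input_features": [{"name": "review_text", "type": "text"}],
    "output_features": [{"name": "sentiment", "type": "category"}],
}

model = LudwigModel(config)
train_stats, _, output_dir = model.train(dataset="reviews.csv")   # hypothetical CSV with matching columns
predictions, _ = model.predict(dataset="reviews.csv")
```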

mise

dev tools, env vars, task runner

Language: Rust · License: MIT · Stargazers: 8716 · Issues: 27 · Issues: 1014

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ · License: Apache-2.0 · Stargazers: 7935 · Issues: 85 · Issues: 1715
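
A hedged sketch of the Python API mentioned in the description above, assuming the high-level LLM entry point shipped in recent TensorRT-LLM releases; the exact import path depends on the installed version, and the model name is a placeholder.

```python
# Rough usage of TensorRT-LLM's high-level LLM API (availability and model name are assumptions).
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")   # builds/loads a TensorRT engine under the hood
params = SamplingParams(max_tokens=64, temperature=0.7)

for output in llm.generate(["Explain in one line what a TensorRT engine is."], params):
    print(output.outputs[0].text)
```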

server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Language: Python · License: BSD-3-Clause · Stargazers: 7900 · Issues: 141 · Issues: 3661
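
A rough client-side sketch of querying a model already deployed on a Triton Inference Server via the tritonclient package; the model name, tensor names, and shapes are assumptions for illustration.

```python
# HTTP inference request to a running Triton server (model/tensor names are hypothetical).
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

data = np.random.rand(1, 16).astype(np.float32)
inp = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
inp.set_data_from_numpy(data)

result = client.infer(model_name="my_model", inputs=[inp])
print(result.as_numpy("OUTPUT0"))
```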

nvtop

GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm

Language: C · License: NOASSERTION · Stargazers: 7877 · Issues: 77 · Issues: 237

rancher-desktop

Container Management and Kubernetes on the Desktop

Language: TypeScript · License: Apache-2.0 · Stargazers: 5775 · Issues: 53 · Issues: 3607

cutlass

CUDA Templates for Linear Algebra Subroutines

Language: C++ · License: NOASSERTION · Stargazers: 5112 · Issues: 106 · Issues: 985

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python · License: Apache-2.0 · Stargazers: 3894 · Issues: 32 · Issues: 1265
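
A minimal sketch of LMDeploy's pipeline API for in-process serving; the model id is a placeholder and not taken from this page.

```python
# One-liner style LMDeploy pipeline (model id is an illustrative assumption).
from lmdeploy import pipeline

pipe = pipeline("internlm/internlm2-chat-7b")   # downloads and serves the model in-process
responses = pipe(["What does KV-cache quantization buy you?"])
print(responses[0].text)
```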

digger

Digger is an open source IaC orchestration tool. Digger allows you to run IaC in your existing CI pipeline ⚡️

Language: Go · License: Apache-2.0 · Stargazers: 2835 · Issues: 19 · Issues: 421

bpftop

bpftop provides a dynamic real-time view of running eBPF programs. It displays the average runtime, events per second, and estimated total CPU % for each program.

Language: C · License: Apache-2.0 · Stargazers: 2149 · Issues: 152 · Issues: 18

lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Language: Python · License: Apache-2.0 · Stargazers: 2011 · Issues: 33 · Issues: 228
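
A hedged sketch of querying a running LoRAX server with its Python client, where a single request selects one of the many hosted LoRA adapters; the server URL and adapter id are invented for illustration.

```python
# Per-request adapter selection against a LoRAX server (URL and adapter id are hypothetical).
from lorax import Client

client = Client("http://127.0.0.1:8080")

response = client.generate(
    "Classify the sentiment of: 'the battery life is great'",
    adapter_id="some-org/sentiment-lora",   # hypothetical fine-tuned LoRA adapter
    max_new_tokens=32,
)
print(response.generated_text)
```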

axlearn

An Extensible Deep Learning Library

Language: Python · License: Apache-2.0 · Stargazers: 1722 · Issues: 61 · Issues: 9

inference

Reference implementations of MLPerf™ inference benchmarks

Language: Python · License: Apache-2.0 · Stargazers: 1165 · Issues: 59 · Issues: 783

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda · License: Apache-2.0 · Stargazers: 988 · Issues: 14 · Issues: 88

llmperf

LLMPerf is a library for validating and benchmarking LLMs

Language: Python · License: Apache-2.0 · Stargazers: 524 · Issues: 9 · Issues: 23

nxs-universal-chart

A universal Helm chart you can use to install any of your applications into Kubernetes/OpenShift

Language: Smarty · License: Apache-2.0 · Stargazers: 365 · Issues: 17 · Issues: 27

rules_oci

Bazel rules for building OCI containers

Language: Starlark · License: Apache-2.0 · Stargazers: 269 · Issues: 10 · Issues: 269

k8s-dra-driver

Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes

Language: Go · License: Apache-2.0 · Stargazers: 210 · Issues: 15 · Issues: 39