SriKrishna Paparaju's repositories
AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It specializes in FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
apex
A PyTorch extension: tools for easy mixed-precision and distributed training in PyTorch
autogen
Enable Next-Gen Large Language Model Applications. Join our Discord: https://discord.gg/pAbnFJrkgZ
AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools so that you can focus on what matters.
chroma
the AI-native open-source embedding database
composer
Supercharge Your Model Training
dcgm-exporter
NVIDIA GPU metrics exporter for Prometheus leveraging DCGM
faiss
A library for efficient similarity search and clustering of dense vectors.
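At its core, the operation faiss accelerates is k-nearest-neighbor search over dense vectors. Below is a hedged, plain-Python sketch of exact L2 search (what faiss's flat indexes compute, minus the SIMD/GPU kernels and approximate index structures); the function name and data are illustrative, not faiss's API:

```python
import math

def search(index_vectors, query, k):
    # Brute-force L2 nearest-neighbor search for illustration only:
    # score every stored vector against the query, then keep the k
    # closest. faiss performs this (and approximate variants) with
    # optimized kernels over millions of vectors.
    scored = [
        (sum((q - v) ** 2 for q, v in zip(query, vec)), i)
        for i, vec in enumerate(index_vectors)
    ]
    scored.sort()
    return [(i, math.sqrt(d2)) for d2, i in scored[:k]]

vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
print(search(vectors, [0.9, 0.1], k=2))  # nearest is vector 1, then vector 0
```

Real workloads would use faiss's compiled indexes rather than this O(n·d) loop; the sketch only pins down the semantics being optimized.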
flash-attention
Fast and memory-efficient exact attention
gpt-fast
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
GPTCache
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
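The idea behind a semantic cache is to reuse a stored LLM answer when a new prompt is close enough, by some similarity measure, to one already answered. The sketch below is a toy stand-in, not GPTCache's API: it uses word-set Jaccard overlap where GPTCache uses real embedding models and a vector store, and the class name and threshold are assumptions for illustration:

```python
class SemanticCache:
    """Toy semantic cache: return a cached answer when a new prompt's
    word-set Jaccard similarity to a stored prompt clears a threshold."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold
        self.entries = []  # list of (word_set, cached_answer)

    @staticmethod
    def _words(text):
        return set(text.lower().split())

    def get(self, prompt):
        words = self._words(prompt)
        best_answer, best_sim = None, 0.0
        for stored, answer in self.entries:
            sim = len(words & stored) / len(words | stored)
            if sim > best_sim:
                best_answer, best_sim = answer, sim
        return best_answer if best_sim >= self.threshold else None

    def put(self, prompt, answer):
        self.entries.append((self._words(prompt), answer))

cache = SemanticCache()
cache.put("what is the capital of France", "Paris")
print(cache.get("what is the capital of France?"))  # cache hit despite wording drift
```

A production cache replaces the similarity function with embedding distance, which is exactly where the faiss/vector-database integrations mentioned in GPTCache's description come in.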
HIP
HIP: C++ Heterogeneous-Compute Interface for Portability
llama
Inference code for LLaMA models
llama_index
LlamaIndex (formerly GPT Index) is a data framework for your LLM applications
llm-foundry
LLM training code for MosaicML foundation models
material-dashboard
Material Dashboard - Open Source Bootstrap 5 Material Design Admin
NeMo
NeMo: a toolkit for conversational AI
Python-Algorithms
All Algorithms implemented in Python
pytorch-lightning
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
safetensors
Simple, safe way to store and distribute tensors
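The safetensors file layout itself is simple: an 8-byte little-endian header length, a JSON header mapping tensor names to dtype, shape, and byte offsets, then the raw tensor bytes. The sketch below reproduces that layout in stdlib Python to show why the format is safe to parse (no code execution, unlike pickle); it is an illustration of the layout, not a replacement for the safetensors library, and the helper names are invented:

```python
import json
import struct

def write_safetensors(path, tensors):
    # tensors: name -> (dtype_string, shape_list, raw_bytes).
    # Layout: [8-byte LE header size][JSON header][concatenated tensor bytes].
    header, payload, offset = {}, b"", 0
    for name, (dtype, shape, data) in tensors.items():
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(data)]}
        payload += data
        offset += len(data)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        f.write(payload)

def read_safetensors(path):
    # Reading needs only json.loads plus byte slicing -- no arbitrary
    # code runs, which is the format's safety argument.
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(n))
        data = f.read()
    return {name: data[meta["data_offsets"][0]:meta["data_offsets"][1]]
            for name, meta in header.items()}

import os, tempfile
path = os.path.join(tempfile.mkdtemp(), "toy.safetensors")
write_safetensors(path, {"w": ("F32", [2], struct.pack("<2f", 1.0, 2.0))})
print(read_safetensors(path))
```

Real checkpoints should be written with the safetensors library, which also validates offsets and supports zero-copy/lazy loading.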
semantic-kernel
Integrate cutting-edge LLM technology quickly and easily into your apps
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
thanos
Highly available Prometheus setup with long term storage capabilities. CNCF Sandbox project.
tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
torchfix
TorchFix - a linter for PyTorch-using code with autofix support
transformers
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch, TensorFlow, and JAX.
triton
Development repository for the Triton language and compiler
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs