Ranjodh Singh's starred repositories
Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
deepsparse
Sparsity-aware deep learning inference runtime for CPUs
flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
long-form-factuality
Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".
LLM-Workshop
LLM Workshop by Sourab Mangrulkar
gaussian-head
Official repository for 'GaussianHead: High-fidelity Head Avatars with Learnable Gaussian Derivation'
indicnlp_corpus
Describes the IndicNLP corpus and associated datasets
languagecodec_tmp
Temporary anonymous version
torch-bnb-fp4
Faster PyTorch bitsandbytes 4-bit FP4 nn.Linear ops