Anoop's repositories
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
gorilla
Gorilla: An API store for LLMs
arcee-trainium-recipes
This repository contains the setup required to run Trainium training jobs.
grammar-based-agents
Modular open LLM agents via prompt chaining and schema-guided generation
lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
peft_lora
Repository accompanying the PEFT/LoRA article.
ipyexperiments
Automatic GPU+CPU memory profiling, re-use, and memory-leak detection using Jupyter/IPython experiment containers
TinyChatEngine
TinyChatEngine: On-Device LLM Inference Library
tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
qlora
QLoRA: Efficient Finetuning of Quantized LLMs
NexusRaven
NexusRaven-13B, a new SOTA open-source LLM for function calling. This repo contains everything needed to reproduce our evaluation of NexusRaven-13B and the baselines.