justheuristic's starred repositories
text-generation-webui
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
llama-recipes
Scripts for fine-tuning Llama2 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization & question answering. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Demo apps showcase Llama2 for WhatsApp & Messenger.
StableCascade
Official Code for Stable Cascade
GitTorrent
A decentralization of GitHub using BitTorrent and Bitcoin
GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
mixtral-offloading
Run Mixtral-8x7B models in Colab or consumer desktops
knn-transformers
PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an implementation of kNN-LM and kNN-MT
FastBinarySearch
Fast and vectorizable algorithms for searching in a vector of sorted floating point numbers
local-search-quantization
State-of-the-art method for large-scale ANN search as of Oct 2016. Presented at ECCV 16.
fast-hadamard-transform
Fast Hadamard transform in CUDA, with a PyTorch interface
weighted-low-rank-bert-compression
Using weighted low-rank approximation to compress BERT.