shamio

https://www.zendegiyesabz.com/

shamio's starred repositories

llama.cpp

LLM inference in C/C++

Language: C++ | License: MIT | 59955 508 3175

text-generation-webui

A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

Language: Python | License: AGPL-3.0 | 37620 325 3463

unsloth

Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Language: Python | License: Apache-2.0 | 11321 80 470

MemGPT

Create LLM agents with long-term memory and custom tools 📚🦙

Language: Python | License: Apache-2.0 | 10497 109 605

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C++ | License: MIT | 7076 75 135

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | License: MIT | 6303 61 76

koboldcpp

A simple one-file way to run various GGML and GGUF models with KoboldAI's UI

Language: C++ | License: AGPL-3.0 | 4183 61 609

OLMo

Modeling, training, eval, and inference code for OLMo

Language: Python | License: Apache-2.0 | 4117 41 157

lollms-webui

Lord of Large Language Models Web User Interface

Language: Vue | License: Apache-2.0 | 3990 59 270

mergekit

Tools for merging pretrained large language models.

Language: Python | License: LGPL-3.0 | 3827 47 237

intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Language: Python | License: Apache-2.0 | 2001 27 143

languagemodels

Explore large language models in 512MB of RAM

Language: Python | License: MIT | 1160 10 17

Long-Context

This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and benchmark tasks that evaluate a model’s information retrieval capabilities with context expansion. We also include key experimental results and instructions for reproducing and building on them.

Language: Python | License: Apache-2.0 | 564 12 6

exui

Web UI for ExLlamaV2

Language: JavaScript | License: MIT | 374 8 41

neural-speed

An innovative library for efficient LLM inference via low-bit quantization

Language: C++ | License: Apache-2.0 | 287 8 41

qmoe

Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".

Language: Python | License: Apache-2.0 | 253 6 4

LongAlign

LongAlign: A Recipe for Long Context Alignment Encompassing Data, Training, and Evaluation

Language: Python | License: Apache-2.0 | 121 8 8

Entropy-ABF

Official implementation for 'Extending LLMs’ Context Window with 100 Samples'

Language: Python | 70 3 2

read-agent.github.io

Language: Jupyter Notebook | 44 3 1