kaballas's repositories
augmentoolkit
Convert Compute And Books Into Instruct-Tuning Datasets
AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
BinaryVectorDB
Efficient vector database for hundreds of millions of embeddings.
chartify
Python library that makes it easy for data scientists to create charts.
chess_llm_interpretability
Visualizing the internal board state of a GPT trained on chess PGN strings, and performing interventions on its internal board state and representation of player Elo.
dspy
DSPy: The framework for programming—not prompting—foundation models
graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
lila-websocket
Experimental WebSocket server for lichess.org - superseded by https://github.com/ornicar/lila-ws
local-gemma
Gemma 2 optimized for your local machine.
LongLM
[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
rags
Build ChatGPT over your data, all with natural language
vsaq
VSAQ is an interactive questionnaire application to assess the security programs of third parties.
groqbook
Groqbook: Generate entire books in seconds using Groq and Llama3
llm.c
LLM training in simple, raw C/CUDA
mem0
The memory layer for Personalized AI
micrograd
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
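The core idea behind micrograd can be sketched in a few dozen lines: each scalar carries its data, its gradient, and a closure that propagates gradients to its inputs, and `backward()` replays those closures in reverse topological order. This is an illustrative sketch of the technique, not micrograd's actual API.

```python
class Value:
    """Scalar with reverse-mode autodiff (illustrative sketch)."""
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None  # propagates out.grad to inputs
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(out)/d(self) = d(out)/d(other) = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # product rule: each input's grad scales by the other's value
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # topological sort so a node's grad is complete before it propagates
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()
```

For example, with `a = Value(2.0)`, `b = Value(3.0)`, and `c = a * b + a`, calling `c.backward()` gives `a.grad == 4.0` (the derivative `b + 1`) and `b.grad == 2.0`.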
minbpe
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
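The BPE algorithm that minbpe implements is simple at its core: repeatedly find the most frequent adjacent pair of token ids and replace every occurrence with a new id. A minimal training loop might look like the sketch below (function and variable names are illustrative, not minbpe's API):

```python
from collections import Counter

def bpe_train(ids, num_merges):
    """Learn `num_merges` BPE merges over a list of integer token ids.
    Returns the compressed sequence and the learned merge table."""
    merges = {}
    next_id = max(ids) + 1  # new token ids start above the base vocabulary
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        pair = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges[pair] = next_id
        # replace every occurrence of `pair` with the new token id
        merged, i = [], 0
        while i < len(ids):
            if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
                merged.append(next_id)
                i += 2
            else:
                merged.append(ids[i])
                i += 1
        ids = merged
        next_id += 1
    return ids, merges
```

Running `bpe_train(list(b"aaabdaaabac"), 2)` first merges the most frequent pair `(97, 97)` ("aa") into token 101, shortening the sequence; encoding then replays the merge table in order, and decoding expands it back.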
MInference
Speeds up long-context LLM inference by computing attention with approximate, dynamic sparsity, reducing pre-filling latency by up to 10x on an A100 while maintaining accuracy.
Minitron
A family of compressed models obtained via pruning and knowledge distillation
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
RAGatouille
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
Samba
Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"
SmallLanguageModel-project
An LLM cookbook for building your own model from scratch, all the way from gathering data to training.
storm
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
Tricycle
Autograd to GPT-2 completely from scratch