VatsaDev

Easily use and train state-of-the-art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease of use, backed by research.

Language: Python | License: Apache-2.0 | 2559 24 162

cramming

Cramming the training of a (BERT-type) language model into limited compute.

Language: Python | License: MIT | 1271 22 34

yet-another-applied-llm-benchmark

A benchmark to evaluate language models on questions I've previously asked them to solve.

Language: Python | License: GPL-3.0 | 810 17 9

quiet-star

Code for Quiet-STaR

Language: Python | License: Apache-2.0 | 354 13 7

galactic

Data cleaning and curation for unstructured text

Language: Python | License: Apache-2.0 | 323 8 4

TPU-Alignment

Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free

Language: Jupyter Notebook | License: Apache-2.0 | 209 7 10

mamba.c

Inference of Mamba models in pure C

Language: C | 175 6 2

xVal

Repository for code used in the xVal paper

Language: Jupyter Notebook | 107 19 5

gptcore

Fast, modular code to create and train cutting-edge LLMs

Language: Python | License: Apache-2.0 | 61 9 7

othello_mamba

Evaluating the Mamba architecture on the Othello game

Language: Python | 39 3 1

quartic-transformer

Exploring an idea where one forgets about efficiency and carries out attention across each edge of the nodes (tokens)

Language: Python | License: MIT | 39 40

FaceRWKV

Course project for COMP4471 on RWKV

Language: Jupyter Notebook | 16 30

Lilith

Using the Lilith optimizer on nanoGPT

Language: Python | License: MIT | 9 20

tiny-asic-4bit-matrix-mul

Tiny matrix multiplication ASIC with 4-bit math

Language: Verilog | License: Apache-2.0 | 3 20

AIMO-Competition

Language: Jupyter Notebook | License: Unlicense | 200

JsLabs

Make and store pages in the URL, using URL variables

Language: JavaScript | License: MIT | 2 20

TransformerMath

Can transformers learn math, like patterns?

Language: Python | License: MIT | 200

2024-Swerve-concept

Describing Swerve functionality, mockup math

Language: Python | 100

opencv-images

Language: Python | 100

NCPT-Lilith

A retrain of the old nanoGPT, but with the Lilith optimizer

Language: Python | License: MIT | 1 20

todo-fbla

The to-do FBLA project

Language: HTML | License: MIT | 1 20