Sebastian Müller (snimu)


Company:BMW

Location:Munich

Twitter:@omouamoua

Sebastian Müller's repositories

rebasin

Apply the methods described in the "Git Re-basin" paper [1] to arbitrary models --- [1] Ainsworth et al. (https://arxiv.org/abs/2209.04836)

Language: Python | License: MIT | Stargazers: 10 | Issues: 2 | Issues: 2
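The core idea behind Git Re-basin is that hidden units can be permuted without changing what a network computes, so two independently trained models can be aligned before merging. A minimal pure-Python sketch of that invariance (illustrative names, not the rebasin API):

```python
# Permuting a hidden layer's neurons, and applying the same permutation
# to the next layer's input weights, leaves the network function
# unchanged -- the invariance that Git Re-basin exploits when aligning
# two models. Toy two-layer ReLU net on plain lists.

def permute_hidden_layer(w1, b1, w2, perm):
    """Permute rows of w1/b1 and columns of w2 by `perm`."""
    w1_p = [w1[p] for p in perm]
    b1_p = [b1[p] for p in perm]
    w2_p = [[row[p] for p in perm] for row in w2]
    return w1_p, b1_p, w2_p

def forward(x, w1, b1, w2):
    """Two-layer net with ReLU: y = w2 @ relu(w1 @ x + b1)."""
    h = [max(0.0, sum(wi * xi for wi, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    return [sum(wi * hi for wi, hi in zip(row, h)) for row in w2]

w1 = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]]
b1 = [0.1, -0.2, 0.0]
w2 = [[1.0, -1.0, 2.0]]
x = [1.0, 1.0]

perm = [2, 0, 1]
w1_p, b1_p, w2_p = permute_hidden_layer(w1, b1, w2, perm)

y1 = forward(x, w1, b1, w2)
y2 = forward(x, w1_p, b1_p, w2_p)
# The permuted network computes the same function (up to float rounding).
print(all(abs(a - b) < 1e-9 for a, b in zip(y1, y2)))  # True
```

The paper's contribution is finding the permutation that best aligns two trained models; this sketch only shows why such a permutation is "free" to apply.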

rebasin-results

Results for snimu/rebasin

Language: Python | License: MIT | Stargazers: 5 | Issues: 1 | Issues: 0

grokfast

Trying out the grokfast algorithm on LLMs

Language: Python | License: Apache-2.0 | Stargazers: 4 | Issues: 0 | Issues: 0
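As I understand the Grokfast-EMA variant (Lee et al., 2024), it keeps an exponential moving average of each gradient and adds an amplified copy of that slow component back onto the raw gradient before the optimizer step. A toy scalar sketch of that filter, not the repository's actual training code:

```python
# Grokfast-EMA sketch: amplify the slow (low-frequency) component of a
# gradient stream via an exponential moving average. `alpha` and `lamb`
# follow the paper's naming; scalars stand in for parameter tensors.

def grokfast_ema(grads, alpha=0.98, lamb=2.0):
    """Filter a stream of scalar gradients, amplifying the slow part."""
    ema = 0.0
    filtered = []
    for g in grads:
        ema = alpha * ema + (1 - alpha) * g
        filtered.append(g + lamb * ema)
    return filtered

# For a constant gradient stream, the filtered gradients grow toward
# g * (1 + lamb) as the EMA converges.
out = grokfast_ema([1.0] * 5)
print(out[0] < out[-1])  # True: the slow component gets amplified over time
```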

kan

Ablate KAN and Fourier KAN vs. normal Linear Layers in LLMs

Language: Python | License: Apache-2.0 | Stargazers: 3 | Issues: 1 | Issues: 0

dspy-redteam-tests

Red-Teaming Language Models with DSPy

Language: Python | Stargazers: 1 | Issues: 0 | Issues: 0

hlb-gpt-cli

CLI controllable version of hlb-gpt by tysam-code

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

attention-experiments

I'm playing around with Attention mechanisms

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

dspy

DSPy: The framework for programming—not prompting—foundation models

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

etbl-vision

Embracing the bitter lesson (vision)

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

gradient-rounding

Round the gradient during LLM training to different degrees; compare "scaling" of rounding to different significant digits to parameter scaling

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
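The operation being ablated here is rounding a value to a fixed number of significant digits (as opposed to decimal places). A pure-Python stand-in for what would be a PyTorch gradient hook in the actual experiments:

```python
# Round a number to `digits` significant digits -- the operation the
# gradient-rounding experiments apply to gradients at varying precision.
import math

def round_significant(x, digits):
    """Round x to `digits` significant digits."""
    if x == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - exponent)

grad = [0.0123456, -4.56789, 0.0]
print([round_significant(g, 3) for g in grad])  # [0.0123, -4.57, 0.0]
```

Note the difference from plain `round(x, 3)`: significant-digit rounding scales with the magnitude of the value, which is what makes it comparable across parameters with very different gradient scales.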

guarantees

Python: guarantee test coverage, and enforce type and runtime guarantees

Language: Python | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

hlb-CIFAR10

Train to 94% on CIFAR-10 in ~6.84 seconds on a single A100, the current world speed record. Or ~95.78% in ~114 seconds (or less!)

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

hlb-gpt

Minimalistic, fast, and experimentation-friendly researcher's toolbench for GPT-like models in ~<365 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in ~138 seconds.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

llm-parameter-stats

How do parameter statistics change over training in LLMs?

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

neuralsort

Sort lists with the help of an ANN to allow maximal parallelism in execution.

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0

parameter-checks

Extend typehints to include dynamic checks (that might otherwise be dealt with by assertions) in Python.

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0
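The idea of attaching dynamic checks to parameters can be sketched as a decorator that validates arguments before the call. The `checks` decorator and its usage below are hypothetical illustrations of the concept, not the parameter-checks library's actual API:

```python
# Illustrative sketch: a decorator that runs dynamic checks on named
# arguments, replacing scattered assert statements at function entry.
import functools
import inspect

def checks(**arg_checks):
    """Attach predicate checks to named parameters of a function."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            bound = inspect.signature(fn).bind(*args, **kwargs)
            bound.apply_defaults()
            for name, check in arg_checks.items():
                value = bound.arguments[name]
                if not check(value):
                    raise ValueError(f"check failed for {name}={value!r}")
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@checks(x=lambda v: v > 0)
def sqrt_newton(x, iters=10):
    """Newton's method square root; requires x > 0."""
    guess = x
    for _ in range(iters):
        guess = 0.5 * (guess + x / guess)
    return guess

print(round(sqrt_newton(9.0), 6))  # 3.0
# sqrt_newton(-1.0) would raise ValueError before the body runs.
```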

px4-simulation-ignition

Fix issue #19981 on PX4-Autopilot

Language: C++ | Stargazers: 0 | Issues: 0 | Issues: 0

torch-benchmarks

Performance benchmark for PyTorch models

Language: Python | License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

hlb-gpt-value-activation

Check how much of a difference activating the value vector makes vs. keeping it linear, as in standard attention

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

llm-small-to-large

1. Train small LLM; 2. Use its outputs on the training data as labels for training large LLM, where their argmax agrees with the training data.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
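The label-selection step in that pipeline can be sketched with toy lists in place of real model logits: keep the small model's output distribution as a soft label only where its argmax agrees with the ground truth, and fall back to the hard label otherwise. Function names here are illustrative, not the repo's code:

```python
# Use a small model's predictions as soft labels for a larger model,
# but only where the small model's argmax matches the training data;
# elsewhere, fall back to the one-hot ground-truth label.

def build_labels(small_model_probs, ground_truth, vocab_size):
    labels = []
    for probs, target in zip(small_model_probs, ground_truth):
        argmax = max(range(len(probs)), key=probs.__getitem__)
        if argmax == target:
            labels.append(probs)  # soft label from the small model
        else:
            one_hot = [0.0] * vocab_size
            one_hot[target] = 1.0
            labels.append(one_hot)  # hard ground-truth label
    return labels

probs = [[0.7, 0.2, 0.1],   # argmax 0, target 0 -> keep soft label
         [0.1, 0.3, 0.6]]   # argmax 2, target 1 -> fall back to hard label
labels = build_labels(probs, ground_truth=[0, 1], vocab_size=3)
print(labels)  # [[0.7, 0.2, 0.1], [0.0, 1.0, 0.0]]
```

The agreement filter is what distinguishes this from plain distillation: the large model only sees the small model's full distribution where that distribution is at least top-1 correct.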

mask

Some experiments with Attention masks

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

plan-act

A better way for LLMs to plan before acting.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with LLMs faster and more controllable.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

torch-nested

Easily manipulate torch.Tensors inside highly nested data-structures.

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0
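The problem torch-nested addresses, applying an operation to every tensor buried in a nested structure, can be illustrated with a small recursive helper. Plain numbers stand in for torch.Tensors here, and `nested_map` is an illustrative function, not the torch-nested API:

```python
# Apply a function to every leaf of a deeply nested dict/list/tuple
# structure, preserving the structure itself -- the kind of traversal
# that torch-nested automates for torch.Tensors.

def nested_map(fn, obj):
    if isinstance(obj, dict):
        return {k: nested_map(fn, v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(nested_map(fn, v) for v in obj)
    return fn(obj)  # a leaf: apply the function

data = {"weights": [1, 2, (3, 4)], "bias": 5}
print(nested_map(lambda x: x * 2, data))
# {'weights': [2, 4, (6, 8)], 'bias': 10}
```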

torchinfo

View model summaries in PyTorch!

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

typing-exe

Executable typehints for Python: make assertions about and/or modify parameters & return values

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

ul2

How much information can we extract from one token?

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0