Madison May's starred repositories
the-algorithm
Source code for Twitter's Recommendation Algorithm
PaddleOCR
Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment across server, mobile, embedded, and IoT devices)
RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embedding.
azure-docs
Open source documentation of Microsoft Azure
silero-models
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
flash-attention
Fast and memory-efficient exact attention
llm-security
New ways of breaking app-integrated LLMs
LlamaAcademy
A school for camelids
awesome-document-understanding
A curated list of resources on Document Understanding (DU)
ThoughtSource
A central, open resource for data and tools related to chain-of-thought reasoning in large language models. Developed @ Samwald research group: https://samwald.info/
MEGABYTE-pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
ema-pytorch
A simple way to keep track of an Exponential Moving Average (EMA) version of your PyTorch model
simple-hierarchical-transformer
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
perceiver-ar-pytorch
Implementation of Perceiver AR, DeepMind's new long-context attention network based on the Perceiver architecture, in PyTorch
llama_index
LlamaIndex (formerly GPT Index) is a data framework for your LLM applications