Kye Gomez's repositories
MultiModalMamba
A novel implementation fusing ViT with Mamba into a fast, agile, and high-performance multi-modal model. Powered by Zeta, the simplest AI framework ever.
VisionMamba
Implementation of Vision Mamba from the paper: "Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model". It's 2.8x faster than DeiT and saves 86.8% GPU memory when performing batch inference to extract features on high-resolution images.
awesome-multi-agent-papers
A compilation of the best multi-agent papers
Python-Package-Template
An easy, reliable, fluid template for Python packages, complete with docs, testing suites, READMEs, GitHub workflows, linting, and much more.
Mixture-of-Depths
Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
Infini-attention
Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTORCH
Lets-Verify-Step-by-Step
"Improving Mathematical Reasoning with Process Supervision" by OPENAI
NeoSapiens
The next evolution of Agents
Reka-Torch
Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch
swarms-cloud
Deploy your autonomous agents to production-grade environments with a 99% uptime guarantee, infinite scalability, and self-healing.
MambaFormer
Implementation of MambaFormer in PyTorch + Zeta, from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks".
MLXTransformer
A simple implementation of a Transformer in MLX, Apple's new framework.
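For flavor, here is a hypothetical minimal self-attention block in MLX, assuming the `mlx.core` / `mlx.nn` APIs; it is a sketch under those assumptions, not this repo's implementation:

```python
# Hypothetical minimal self-attention block in MLX (assumed API; see the repo
# for the actual implementation).
import mlx.core as mx
import mlx.nn as nn


class TinyAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)  # fused query/key/value projection
        self.out = nn.Linear(dim, dim)
        self.norm = nn.LayerNorm(dim)

    def __call__(self, x: mx.array) -> mx.array:
        d = x.shape[-1]
        q, k, v = mx.split(self.qkv(self.norm(x)), 3, axis=-1)
        scores = (q @ mx.transpose(k, (0, 2, 1))) / d ** 0.5
        return x + self.out(mx.softmax(scores, axis=-1) @ v)  # pre-norm residual


y = TinyAttention(64)(mx.random.normal((2, 8, 64)))  # -> (2, 8, 64)
```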
BRAVE-ViT-Swarm
Implementation of the paper: "BRAVE : Broadening the visual encoding of vision-language models"
SimplifiedTransformers
SimplifiedTransformers simplifies the transformer block without affecting training: skip connections, projection parameters, sequential sub-blocks, and normalization layers are removed. Experimental results confirm similar training speed and performance.
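A minimal sketch of what such a stripped-down block could look like, assuming removal of the value/output projections, LayerNorm, and the residual skip, with the attention and MLP paths computed in parallel; the `SimplifiedBlock` name and details are illustrative, not the repo's code:

```python
# Hypothetical stripped-down block (illustrative, not the repo's code): no
# skips, no LayerNorm, no value/output projections, and the attention and MLP
# paths run in parallel rather than sequentially.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimplifiedBlock(nn.Module):
    def __init__(self, dim: int, mlp_ratio: int = 4):
        super().__init__()
        self.wq = nn.Linear(dim, dim, bias=False)
        self.wk = nn.Linear(dim, dim, bias=False)
        # No W_V or output projection: attention mixes the raw token states.
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio), nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        d = x.shape[-1]
        attn = F.softmax(self.wq(x) @ self.wk(x).transpose(-2, -1) / d ** 0.5, dim=-1)
        return attn @ x + self.mlp(x)  # parallel sub-blocks, no residual skip


y = SimplifiedBlock(64)(torch.randn(2, 16, 64))  # -> (2, 16, 64)
```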
CELESTIAL-1
Omni-Modality Processing, Understanding, and Generation
GiediPrime
An experimental architecture using a Mixture of Attentions with sandwiched Macaron feedforwards and other modules.