Simu's repositories
Griffin-Jax
Jax implementation of "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"
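A minimal sketch of the gated linear recurrence that Griffin's recurrent block builds on, written with jax.lax.scan. Names, shapes, and the gating form are illustrative assumptions, not the repo's API.

```python
import jax
import jax.numpy as jnp


def gated_linear_recurrence(a, x):
    """h_t = a_t * h_{t-1} + (1 - a_t) * x_t, scanned over the time axis.

    a, x: arrays of shape (seq_len, dim), with a in (0, 1) acting as a per-step gate.
    """
    def step(h, inputs):
        a_t, x_t = inputs
        h = a_t * h + (1.0 - a_t) * x_t
        return h, h

    h0 = jnp.zeros_like(x[0])
    _, hs = jax.lax.scan(step, h0, (a, x))
    return hs  # (seq_len, dim) sequence of hidden states


x = jax.random.normal(jax.random.PRNGKey(0), (16, 8))
a = jax.nn.sigmoid(jax.random.normal(jax.random.PRNGKey(1), (16, 8)))  # gates in (0, 1)
print(gated_linear_recurrence(a, x).shape)  # (16, 8)
```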
miniF2F-code
Dataset of formal Olympiad-level mathematics problems solved with Python code instructions.
Tri-RMSNorm
Efficient kernel for RMS normalization with fused operations; includes both forward and backward passes and is compatible with PyTorch.
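For reference, this is the math such a fused kernel computes, shown as a plain (non-fused) PyTorch baseline: y = x / sqrt(mean(x^2) + eps) * weight. This is an illustrative sketch, not the repo's kernel.

```python
import torch


def rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Normalize over the last dimension by the root-mean-square of its elements.
    rms = torch.sqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return x / rms * weight


x = torch.randn(4, 512)
w = torch.ones(512)
print(rms_norm(x, w).shape)  # torch.Size([4, 512])
```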
LongConv-Jax
Jax/Flax/Linen implementation of "Simple Hardware-Efficient Long Convolutions for Sequence Modeling"
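The core operation in that paper is a depthwise convolution whose kernel spans the whole sequence, computed with FFTs. Below is a single-function sketch under assumed shapes, not the repo's module.

```python
import jax.numpy as jnp
from jax import random


def long_conv(u, k):
    """Long convolution via zero-padded FFT.

    u: input of shape (seq_len, dim); k: kernel of shape (seq_len, dim).
    Returns the convolution (u * k) truncated back to seq_len.
    """
    seq_len = u.shape[0]
    n = 2 * seq_len  # pad to avoid wrap-around from the circular FFT convolution
    u_f = jnp.fft.rfft(u, n=n, axis=0)
    k_f = jnp.fft.rfft(k, n=n, axis=0)
    return jnp.fft.irfft(u_f * k_f, n=n, axis=0)[:seq_len]


u = random.normal(random.PRNGKey(0), (1024, 64))
k = random.normal(random.PRNGKey(1), (1024, 64))
print(long_conv(u, k).shape)  # (1024, 64)
```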
triton-activations
Collection of neural network activation function kernels for OpenAI's Triton language and compiler
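A minimal example of the kind of elementwise activation kernel such a collection contains: a ReLU written in Triton, launched from PyTorch on a CUDA device. Illustrative only; not the repo's own kernels.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def relu_kernel(x_ptr, y_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements              # guard the tail block
    x = tl.load(x_ptr + offsets, mask=mask)
    tl.store(y_ptr + offsets, tl.maximum(x, 0.0), mask=mask)


def relu(x: torch.Tensor) -> torch.Tensor:
    y = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)
    relu_kernel[grid](x, y, n, BLOCK_SIZE=1024)
    return y


x = torch.randn(4096, device="cuda")
assert torch.allclose(relu(x), torch.relu(x))
```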
GradientAscent-Jax
Custom gradient ascent solver (optimizer) for JAX/Flax models
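A gradient ascent step simply moves parameters along the gradient of the objective instead of against it. A minimal hand-rolled sketch in JAX follows; the repo's actual solver API may differ.

```python
import jax
import jax.numpy as jnp


def objective(params, x):
    # Toy concave objective to maximize: -||x - params||^2
    return -jnp.sum((x - params) ** 2)


@jax.jit
def ascent_step(params, x, lr=0.1):
    grads = jax.grad(objective)(params, x)
    # Ascend: add the gradient rather than subtracting it.
    return jax.tree_util.tree_map(lambda p, g: p + lr * g, params, grads)


params = jnp.zeros(3)
target = jnp.array([1.0, -2.0, 0.5])
for _ in range(100):
    params = ascent_step(params, target)
print(params)  # approaches [1.0, -2.0, 0.5]
```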
lmppl-cli-csv-wrapper
A tiny CLI wrapper around lmppl for computing pre-trained language model perplexity over CSV files
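A sketch of what such a wrapper does under the hood: read a text column from a CSV and score it with lmppl's causal LM scorer. The file name, column name, and model choice here are assumptions, not the wrapper's actual interface.

```python
import pandas as pd
import lmppl

df = pd.read_csv("inputs.csv")               # hypothetical input file
texts = df["text"].astype(str).tolist()      # hypothetical column name

scorer = lmppl.LM("gpt2")                    # causal LM perplexity scorer
df["perplexity"] = scorer.get_perplexity(texts)
df.to_csv("inputs_with_ppl.csv", index=False)
```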
Mixture-of-Depths-Jax
Jax module for the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
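An illustrative sketch of the Mixture-of-Depths idea: a router scores tokens, the top-k go through the expensive block, and the rest pass through on the residual path. The router, block, and capacity below are toy stand-ins, not the repo's module.

```python
import jax
import jax.numpy as jnp


def mixture_of_depths(x, router_w, block_fn, capacity):
    """x: (seq_len, dim); router_w: (dim,); capacity: number of tokens processed."""
    scores = x @ router_w                                  # (seq_len,) router logits
    _, top_idx = jax.lax.top_k(scores, capacity)           # tokens routed into the block
    selected = x[top_idx]                                  # (capacity, dim)
    processed = block_fn(selected)                         # run the expensive block
    # Scatter processed tokens back with a residual; unselected tokens stay unchanged.
    return x.at[top_idx].set(processed + selected)


x = jax.random.normal(jax.random.PRNGKey(0), (32, 16))
router_w = jax.random.normal(jax.random.PRNGKey(1), (16,))
block = lambda h: jax.nn.gelu(h)                           # toy stand-in for an MLP/attention block
print(mixture_of_depths(x, router_w, block, capacity=8).shape)  # (32, 16)
```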
Ring-Attention-Jax
Packaged implementation of "Ring Attention with Blockwise Transformers for Near-Infinite Context" in Jax + Flax.
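Ring Attention circulates key/value blocks around a ring of devices; the per-device work is blockwise attention with a running (online) softmax. Below is a single-device sketch of that blockwise accumulation only, with no ring communication and with illustrative names and shapes.

```python
import jax
import jax.numpy as jnp


def blockwise_attention(q, k, v, block_size):
    """q: (Lq, d); k, v: (Lk, d). Never materializes the full Lq x Lk score matrix."""
    Lq, d = q.shape
    scale = 1.0 / jnp.sqrt(d)
    m = jnp.full((Lq, 1), -jnp.inf)      # running max of scores
    l = jnp.zeros((Lq, 1))               # running softmax denominator
    acc = jnp.zeros_like(q)              # running weighted sum of values
    for start in range(0, k.shape[0], block_size):
        kb, vb = k[start:start + block_size], v[start:start + block_size]
        s = (q @ kb.T) * scale
        m_new = jnp.maximum(m, s.max(axis=-1, keepdims=True))
        p = jnp.exp(s - m_new)
        correction = jnp.exp(m - m_new)  # rescale previous partial sums
        l = l * correction + p.sum(axis=-1, keepdims=True)
        acc = acc * correction + p @ vb
        m = m_new
    return acc / l


q = jax.random.normal(jax.random.PRNGKey(0), (8, 4))
k = jax.random.normal(jax.random.PRNGKey(1), (32, 4))
v = jax.random.normal(jax.random.PRNGKey(2), (32, 4))
ref = jax.nn.softmax((q @ k.T) / jnp.sqrt(4.0), axis=-1) @ v
assert jnp.allclose(blockwise_attention(q, k, v, block_size=8), ref, atol=1e-5)
```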
Python-Template
Python Package Template is all you need
simudt.github.io
blog for the AI era
Composable-Datasets
Transform JSONL Q&A datasets to instruct format with ease
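A toy example of the kind of transform meant here: map JSONL {"question", "answer"} records to an {"instruction", "input", "output"} instruct format. The field names and file names are assumptions, not the repo's actual schema or API.

```python
import json


def qa_to_instruct(in_path: str, out_path: str) -> None:
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            record = json.loads(line)
            instruct = {
                "instruction": record["question"],
                "input": "",
                "output": record["answer"],
            }
            dst.write(json.dumps(instruct, ensure_ascii=False) + "\n")


qa_to_instruct("qa.jsonl", "instruct.jsonl")  # hypothetical file names
```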
jax-triton
jax-triton contains integrations between JAX and OpenAI Triton
MEGABYTE-pytorch-DS
Fork of MEGABYTE - PyTorch by lucidrains ("Predicting Million-byte Sequences with Multiscale Transformers") with a modified DeepSpeed training setup
PaLM-rlhf-pytorch-DS
Fork of RLHF (Reinforcement Learning from Human Feedback) by lucidrains on top of the PaLM architecture, with a modified DeepSpeed training setup. Basically ChatGPT, but with PaLM
Simba
A simpler PyTorch + Zeta implementation of the paper: "SiMBA: Simplified Mamba-based Architecture for Vision and Multivariate Time series"
zeta
Build high-performance AI models with modular building blocks