Yoshinari Fujinuma (akkikiki)


Company: AWS AI Labs

Location: New York, USA

Home Page: http://akkikiki.github.io

Twitter: @akkikiki


Yoshinari Fujinuma's starred repositories

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python · License: MIT · Stars: 35,110 · Issues: 353 · Issues: 305

mamba

The Mamba SSM architecture (official implementation)

Language: Python · License: Apache-2.0 · Stars: 11,886 · Issues: 98 · Issues: 436

llama-recipes

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default and custom datasets for applications such as summarization and Q&A, plus a number of inference solutions such as HF TGI and vLLM for local or cloud deployment, and demo apps showcasing Meta Llama3 for WhatsApp & Messenger.

Language: Jupyter Notebook · License: NOASSERTION · Stars: 10,706 · Issues: 88 · Issues: 296

mistral-inference

Official inference library for Mistral models

Language: Jupyter Notebook · License: Apache-2.0 · Stars: 9,326 · Issues: 120 · Issues: 129

trl

Train transformer language models with reinforcement learning.

Language: Python · License: Apache-2.0 · Stars: 8,869 · Issues: 75 · Issues: 1,013

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python · License: NOASSERTION · Stars: 5,762 · Issues: 46 · Issues: 75

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python · License: Apache-2.0 · Stars: 4,265 · Issues: 111 · Issues: 124

mergekit

Tools for merging pretrained large language models.

Language: Python · License: LGPL-3.0 · Stars: 4,166 · Issues: 47 · Issues: 261
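The simplest merge strategy in the mergekit family is linear weight averaging ("model soup"-style). As a hedged illustration only — plain dicts of parameter name to float standing in for real tensor checkpoints, and not mergekit's actual API — the idea reduces to a weighted average over identically keyed parameters:

```python
# Linear ("model soup"-style) weight averaging: merged parameter = the
# weighted mean of the same parameter across source models. Illustrative
# sketch only; mergekit operates on real checkpoints and also offers
# SLERP, TIES, DARE, and other merge methods.

def linear_merge(models, weights):
    """Weighted average of identically-keyed parameter dicts."""
    total = sum(weights)
    merged = {}
    for key in models[0]:
        merged[key] = sum(w * m[key] for w, m in zip(weights, models)) / total
    return merged

m1 = {"layer.w": 1.0, "layer.b": 0.0}
m2 = {"layer.w": 3.0, "layer.b": 2.0}
print(linear_merge([m1, m2], weights=[1.0, 1.0]))
# → {'layer.w': 2.0, 'layer.b': 1.0}
```

Unequal weights bias the merge toward one parent model, which is the same knob the more elaborate merge methods refine per-parameter.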

Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Language: Python · License: Apache-2.0 · Stars: 2,652 · Issues: 30 · Issues: 101

mamba-minimal

Simple, minimal implementation of the Mamba SSM in one file of PyTorch.

Language: Python · License: Apache-2.0 · Stars: 2,463 · Issues: 23 · Issues: 24
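The linear recurrence at the heart of the Mamba SSM entries above can be sketched with a toy scalar version. This is an illustration of the discretized state-space step, not code from any of these repositories; the function name and scalar parameters are invented for the sketch, and real Mamba uses per-channel states with input-dependent (selective) parameters:

```python
# Toy discretized state-space (SSM) recurrence:
#   h[t] = a_bar * h[t-1] + b_bar * x[t]   (state update)
#   y[t] = c * h[t]                        (readout)
# Scalar state for clarity; Mamba vectorizes this per channel and makes
# a_bar/b_bar functions of the input ("selective" scan).

def ssm_scan(x, a_bar, b_bar, c):
    """Run the linear recurrence over a 1-D input sequence."""
    h = 0.0
    ys = []
    for x_t in x:
        h = a_bar * h + b_bar * x_t
        ys.append(c * h)
    return ys

print(ssm_scan([1.0, 0.0, 0.0], a_bar=0.5, b_bar=1.0, c=2.0))
# impulse response decays geometrically: → [2.0, 1.0, 0.5]
```

Because the recurrence is linear in h, it can be evaluated as a parallel scan rather than a sequential loop, which is the trick the optimized implementations exploit.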

deep_learning_curriculum

Language model alignment-focused deep learning curriculum

nanotron

Minimalistic 3D-parallelism training library for large language models

Language: Python · License: Apache-2.0 · Stars: 995 · Issues: 40 · Issues: 66

shell-ai

LangChain-powered CLI for generating and running shell commands

Language: Python · License: MIT · Stars: 982 · Issues: 14 · Issues: 21

Triton-Puzzles

Puzzles for learning Triton

Language: Jupyter Notebook · License: Apache-2.0 · Stars: 885 · Issues: 7 · Issues: 9

mamba.py

A simple and efficient Mamba implementation in pure PyTorch and MLX.

Language: Python · License: MIT · Stars: 813 · Issues: 4 · Issues: 25

template

Template repository for the Distill web framework

Language: JavaScript · License: Apache-2.0 · Stars: 777 · Issues: 14 · Issues: 97

recurrentgemma

Open weights language model from Google DeepMind, based on Griffin.

Language: Python · License: Apache-2.0 · Stars: 577 · Issues: 18 · Issues: 7

gpt_paper_assistant

GPT-4-based personalized arXiv paper assistant bot

Language: Python · License: Apache-2.0 · Stars: 442 · Issues: 6 · Issues: 10

open_lm

A repository for research on medium-sized language models.

Language: Python · License: MIT · Stars: 418 · Issues: 22 · Issues: 60

SAELens

Training Sparse Autoencoders on Language Models

Language: HTML · License: MIT · Stars: 234 · Issues: 9 · Issues: 72

CoLT5-attention

Implementation of the conditionally routed attention from the CoLT5 architecture, in PyTorch

Language: Python · License: MIT · Stars: 218 · Issues: 8 · Issues: 7

sae

Sparse autoencoders

Language: Python · License: MIT · Stars: 202 · Issues: 5 · Issues: 1

mixture-of-attention

Personal experiments on routing tokens to different autoregressive attention modules, akin to mixture-of-experts

Language: Python · License: MIT · Stars: 100 · Issues: 8 · Issues: 0

kotomamba

Mamba training library developed by Kotoba Technologies

Language: Python · License: Apache-2.0 · Stars: 61 · Issues: 5 · Issues: 0

MLLM-Judge

[ICML 2024 Oral] Official code repository for MLLM-as-a-Judge.

Language: Python · Stars: 23 · Issues: 0 · Issues: 0