Yoshinari Fujinuma (akkikiki)

Company: AWS AI Labs

Location: New York, USA

Home Page: http://akkikiki.github.io

Twitter: @akkikiki

Yoshinari Fujinuma's starred repositories

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python | License: MIT | Stargazers: 35930 | Issues: 368 | Issues: 312

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 12317 | Issues: 100 | Issues: 484

mistral-inference

Official inference library for Mistral models

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 9457 | Issues: 120 | Issues: 136

trl

Train transformer language models with reinforcement learning.

Language: Python | License: Apache-2.0 | Stargazers: 9129 | Issues: 72 | Issues: 1063

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | Stargazers: 5923 | Issues: 47 | Issues: 78

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python | License: Apache-2.0 | Stargazers: 4407 | Issues: 109 | Issues: 133

mergekit

Tools for merging pretrained large language models.

Language: Python | License: LGPL-3.0 | Stargazers: 4398 | Issues: 49 | Issues: 288

Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Language: Python | License: Apache-2.0 | Stargazers: 2753 | Issues: 30 | Issues: 107

mamba-minimal

Simple, minimal implementation of the Mamba SSM in one file of PyTorch.

Language: Python | License: Apache-2.0 | Stargazers: 2518 | Issues: 23 | Issues: 26

deep_learning_curriculum

Language model alignment-focused deep learning curriculum

nanotron

Minimalistic large language model 3D-parallelism training

Language: Python | License: Apache-2.0 | Stargazers: 1069 | Issues: 42 | Issues: 74

Triton-Puzzles

Puzzles for learning Triton

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 931 | Issues: 7 | Issues: 9

mamba.py

A simple and efficient Mamba implementation in pure PyTorch and MLX.

Language: Python | License: MIT | Stargazers: 874 | Issues: 7 | Issues: 35

ultravox

A fast multimodal LLM for real-time voice

Language: Python | License: MIT | Stargazers: 788 | Issues: 21 | Issues: 19

template

This is the repository for the distill web framework

Language: JavaScript | License: Apache-2.0 | Stargazers: 785 | Issues: 13 | Issues: 98

recurrentgemma

Open weights language model from Google DeepMind, based on Griffin.

Language: Python | License: Apache-2.0 | Stargazers: 582 | Issues: 18 | Issues: 7

gpt_paper_assistant

GPT4 based personalized ArXiv paper assistant bot

Language: Python | License: Apache-2.0 | Stargazers: 465 | Issues: 6 | Issues: 10

open_lm

A repository for research on medium sized language models.

Language: Python | License: MIT | Stargazers: 463 | Issues: 21 | Issues: 63

SAELens

Training Sparse Autoencoders on Language Models

Language: HTML | License: MIT | Stargazers: 341 | Issues: 8 | Issues: 87

sae

Sparse autoencoders

Language: Python | License: MIT | Stargazers: 272 | Issues: 7 | Issues: 10

CoLT5-attention

Implementation of the conditionally routed attention in the CoLT5 architecture, in PyTorch

Language: Python | License: MIT | Stargazers: 222 | Issues: 8 | Issues: 7

blogcaster

Python tools for easily translating your blog content to podcasts & YouTube

Language: Python | License: Apache-2.0 | Stargazers: 196 | Issues: 2 | Issues: 11

function_vectors

Function Vectors in Large Language Models (ICLR 2024)

Language: Python | License: NOASSERTION | Stargazers: 131 | Issues: 5 | Issues: 8

mixture-of-attention

Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts

Language: Python | License: MIT | Stargazers: 101 | Issues: 8 | Issues: 0

kotomamba

Mamba training library developed by kotoba technologies

Language: Python | License: Apache-2.0 | Stargazers: 62 | Issues: 5 | Issues: 0

MLLM-Judge

[ICML 2024 Oral] Official code repository for MLLM-as-a-Judge.

Language: Python | Stargazers: 43 | Issues: 1 | Issues: 0