Yoshinari Fujinuma (akkikiki)

Company: AWS AI Labs

Location: New York, USA

Home Page: http://akkikiki.github.io

Twitter: @akkikiki

Yoshinari Fujinuma's starred repositories

function_vectors

Function Vectors in Large Language Models (ICLR 2024)

Language: Python | Stargazers: 81 | Issues: 0

nanotron

Minimalistic large language model 3D-parallelism training

Language: Python | License: Apache-2.0 | Stargazers: 946 | Issues: 0

sae

Sparse autoencoders

Language: Python | License: MIT | Stargazers: 108 | Issues: 0

mergekit

Tools for merging pretrained large language models.

Language: Python | License: LGPL-3.0 | Stargazers: 4030 | Issues: 0
Language: Python | License: MIT | Stargazers: 624 | Issues: 0

SAELens

Training Sparse Autoencoders on Language Models

Language: HTML | License: MIT | Stargazers: 198 | Issues: 0

MLLM-Judge

[ICML 2024 Oral] Official code repository for MLLM-as-a-Judge.

Language: Python | Stargazers: 20 | Issues: 0

template

This is the repository for the distill web framework

Language: JavaScript | License: Apache-2.0 | Stargazers: 773 | Issues: 0
Language: TeX | License: MIT | Stargazers: 40 | Issues: 0
Language: Python | Stargazers: 38 | Issues: 0

recurrentgemma

Open weights language model from Google DeepMind, based on Griffin.

Language: Python | License: Apache-2.0 | Stargazers: 565 | Issues: 0

deep_learning_curriculum

Language model alignment-focused deep learning curriculum

Stargazers: 1162 | Issues: 0

Triton-Puzzles

Puzzles for learning Triton

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 843 | Issues: 0

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | Stargazers: 5619 | Issues: 0

Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Language: Python | License: Apache-2.0 | Stargazers: 2560 | Issues: 0

mamba-minimal

Simple, minimal implementation of the Mamba SSM in one file of PyTorch.

Language: Python | License: Apache-2.0 | Stargazers: 2435 | Issues: 0

mamba.py

A simple and efficient Mamba implementation in pure PyTorch and MLX.

Language: Python | License: MIT | Stargazers: 772 | Issues: 0

kotomamba

Mamba training library developed by kotoba technologies

Language: Python | License: Apache-2.0 | Stargazers: 59 | Issues: 0

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 11500 | Issues: 0

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python | License: Apache-2.0 | Stargazers: 4178 | Issues: 0

gpt_paper_assistant

GPT4 based personalized ArXiv paper assistant bot

Language: Python | License: Apache-2.0 | Stargazers: 436 | Issues: 0

trl

Train transformer language models with reinforcement learning.

Language: Python | License: Apache-2.0 | Stargazers: 8693 | Issues: 0

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python | License: MIT | Stargazers: 34449 | Issues: 0
Language: Python | License: MIT | Stargazers: 89 | Issues: 0

mistral-inference

Official inference library for Mistral models

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 9140 | Issues: 0

open_lm

A repository for research on medium sized language models.

Language: Python | License: MIT | Stargazers: 343 | Issues: 0
Language: Python | License: NOASSERTION | Stargazers: 129 | Issues: 0

CoLT5-attention

Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch

Language: Python | License: MIT | Stargazers: 218 | Issues: 0

mixture-of-attention

Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts

Language: Python | License: MIT | Stargazers: 100 | Issues: 0

shell-ai

LangChain powered shell command generator and runner CLI

Language: Python | License: MIT | Stargazers: 973 | Issues: 0