Kazuki Fujii's repositories
moe-recipes
Ongoing research training Mixture of Experts models
llm-recipes
Ongoing research project for continual pre-training of LLMs (dense models)
wandb_watcher
A tool for monitoring wandb jobs for the ABCI large language model development support program
Megatron-LM
Ongoing research training transformer models at scale
deploymentmanager-samples
Deployment Manager samples and templates.
grouped_gemm
PyTorch bindings for CUTLASS grouped GEMM.
levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax
llama-recipes
Examples and recipes for the Llama 2 model
llama3v
A SOTA vision model built on top of llama3 8B.
Megatron-LM-ABCI
NVIDIA Megatron-LM fork
ml-engineering
Machine Learning Engineering Open Book
nanotron
Minimalistic large language model 3D-parallelism training
NeMo
NeMo: a toolkit for conversational AI
NeMo-Aligner
Scalable toolkit for efficient model alignment
NeMo-Megatron-Launcher
NeMo Megatron launcher and tools
torchtitan
A native PyTorch Library for large model training
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.