YOKOTA Laboratory at Tokyo Tech's repositories
Megatron-Llama2
2023 ABCI Llama-2 continual-learning project
DeepSpeedFugaku
main: microsoft/Megatron-DeepSpeed, cpu: stable branch for running on Fugaku
PixPro-with-OpticalFlow
Pixel-level Contrastive Learning of Driving Videos with Optical Flow, CVPR 2023 Workshop
Megatron-DeepSpeed-Ylab
Ongoing research training transformer language models at scale, including: BERT & GPT-2
cutlass
CUDA Templates for Linear Algebra Subroutines
detectron2
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
FederatedLearning
An adaptable federated learning framework with a central server, supporting diverse datasets, models, and optimizers. Facilitates collaborative, yet private, data training with customizable aggregation algorithms.
gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
grok-1
Grok open release
lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
m2
Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
PixPro
Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning, CVPR 2021
stars-h
Software for Testing Accuracy, Reliability and Scalability of Hierarchical computations.
STRUMPACK
Structured Matrix Package (LBNL)
ylab_server_public
How to use the Hinadori cluster (for public)
zero-bubble-pipeline-parallelism
Zero Bubble Pipeline Parallelism