Dicardo Xue (DicardoX)

Company: Shanghai Jiao Tong University

Dicardo Xue's starred repositories

FlexTensor

Automatic Schedule Exploration and Optimization Framework for Tensor Computations

Language: Python · License: MIT · Stars: 174 · Issues: 0

MAGIS

MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)

Language: Python · License: MIT · Stars: 34 · Issues: 0

TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Language: Python · License: BSD-3-Clause · Stars: 2483 · Issues: 0
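
Since this is the Torch-TensorRT frontend, a short sketch of its documented `torch_tensorrt.compile` entry point may help; exact options (e.g. `enabled_precisions`) vary across releases, so treat this as an assumption-laden example rather than canonical usage.

```python
# A sketch of Torch-TensorRT compilation, assuming the documented
# `torch_tensorrt.compile` API; option names may differ across releases.
import torch
import torch_tensorrt

model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 64),
).eval().cuda()

# Compile for a fixed input shape; FP16 kernels enabled as an example.
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((8, 128), dtype=torch.float32)],
    enabled_precisions={torch.float16},
)

x = torch.randn(8, 128, device="cuda")
print(trt_model(x).shape)  # torch.Size([8, 64])
```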

brainstorm

Compiler for Dynamic Neural Networks

Language: Python · Stars: 42 · Issues: 0

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

Language: Python · License: Apache-2.0 · Stars: 1770 · Issues: 0
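
A minimal sketch of the FP8 path, assuming the documented `transformer_engine.pytorch` names (`te.Linear`, `te.fp8_autocast`); FP8 execution itself requires Hopper- or Ada-class hardware.

```python
# A sketch of TransformerEngine's FP8 path, assuming the documented
# `transformer_engine.pytorch` API; requires a Hopper/Ada GPU.
import torch
import transformer_engine.pytorch as te

layer = te.Linear(1024, 1024, bias=True).cuda()
x = torch.randn(16, 1024, device="cuda")

# Matmuls inside this context run in FP8 with TE-managed scaling.
with te.fp8_autocast(enabled=True):
    y = layer(x)
print(y.shape)  # torch.Size([16, 1024])
```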

FTPipe

FTPipe and related pipeline model parallelism research.

Language: Python · Stars: 41 · Issues: 0

fairscale

PyTorch extensions for high performance and large scale training.

Language: Python · License: NOASSERTION · Stars: 3129 · Issues: 0

torchtune

A Native-PyTorch Library for LLM Fine-tuning

Language: Python · License: BSD-3-Clause · Stars: 3867 · Issues: 0

x-transformers

A simple but complete full-attention transformer with a set of promising experimental features from various papers

Language: Python · License: MIT · Stars: 4525 · Issues: 0
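
A minimal sketch following the project's README-style `TransformerWrapper` + `Decoder` interface; the hyperparameters here are arbitrary.

```python
# A sketch following x-transformers' README-style API.
import torch
from x_transformers import TransformerWrapper, Decoder

model = TransformerWrapper(
    num_tokens=20000,      # vocabulary size
    max_seq_len=1024,
    attn_layers=Decoder(dim=512, depth=6, heads=8),
)

tokens = torch.randint(0, 20000, (1, 1024))
logits = model(tokens)     # (1, 1024, 20000)
```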

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Language: Python · License: NOASSERTION · Stars: 8283 · Issues: 0
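
A minimal sketch of the library's `memory_efficient_attention` operator, assuming the documented `(batch, seq_len, heads, head_dim)` tensor layout.

```python
# A sketch of xformers' memory-efficient attention operator; the
# (batch, seq_len, heads, head_dim) layout follows the library's docs.
import torch
from xformers.ops import memory_efficient_attention

q = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)

# Computes softmax(QK^T / sqrt(d)) V without materializing the full
# attention matrix, in the spirit of FlashAttention.
out = memory_efficient_attention(q, k, v)
print(out.shape)  # (2, 1024, 8, 64)
```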

xv6-labs-2022-solutions

Solutions and explanations for the MIT 6.828 (6.S081 / 6.1810) xv6-labs-2022 labs.

Language: C · Stars: 101 · Issues: 0

corenet

CoreNet: A library for training deep neural networks

Language: Python · License: NOASSERTION · Stars: 6904 · Issues: 0

paxml

Pax is a JAX-based machine learning framework for training large-scale models. Pax allows for advanced and fully configurable experimentation and parallelization, and has demonstrated industry-leading model FLOPs utilization rates.

Language: Python · License: Apache-2.0 · Stars: 443 · Issues: 0

YHs_Sample

Yinghan's Code Sample

Language: Cuda · License: GPL-3.0 · Stars: 267 · Issues: 0

LLaMA-Megatron

A LLaMA1/LLaMA2 Megatron implementation.

Language: Python · License: Apache-2.0 · Stars: 26 · Issues: 0

gradient-checkpointing

Make huge neural nets fit in memory

Language: Python · License: MIT · Stars: 2689 · Issues: 0
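
The repo's idea, recomputing activations in the backward pass instead of storing them, is also available natively in PyTorch; the sketch below uses `torch.utils.checkpoint` to illustrate the technique and is not this repository's own API.

```python
# Not this repository's API: the same recomputation idea via PyTorch's
# built-in torch.utils.checkpoint. Activations inside the checkpointed
# segment are dropped after forward and recomputed during backward,
# trading extra compute for memory.
import torch
from torch.utils.checkpoint import checkpoint

block = torch.nn.Sequential(
    torch.nn.Linear(512, 512), torch.nn.ReLU(),
    torch.nn.Linear(512, 512), torch.nn.ReLU(),
)

x = torch.randn(32, 512, requires_grad=True)
y = checkpoint(block, x, use_reentrant=False)  # forward without saving activations
y.sum().backward()                             # block re-runs here to produce grads
```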

Megatron-LM

Ongoing research training transformer models at scale

Language: Python · License: NOASSERTION · Stars: 9813 · Issues: 0
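
Megatron's core tensor-parallel trick is splitting weight matrices across GPUs; below is a single-process numeric sketch of the column-parallel linear layer from the Megatron-LM paper, not Megatron's actual code.

```python
# Not Megatron's code: a single-process numeric sketch of the
# column-parallel linear layer from the Megatron-LM paper. The weight
# matrix A is split by columns across "ranks"; each rank computes its
# shard and an all-gather reassembles the full output.
import torch

torch.manual_seed(0)
X = torch.randn(4, 8)          # activations
A = torch.randn(8, 6)          # full weight matrix

A0, A1 = A[:, :3], A[:, 3:]    # rank 0 / rank 1 shards
Y0, Y1 = X @ A0, X @ A1        # independent per-rank matmuls

Y = torch.cat([Y0, Y1], dim=1) # stands in for the all-gather
assert torch.allclose(Y, X @ A)
```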

veScale

A PyTorch Native LLM Training Framework

Language: Python · License: Apache-2.0 · Stars: 561 · Issues: 0

grok-1

Grok open release

Language: Python · License: Apache-2.0 · Stars: 49391 · Issues: 0

torchgpipe

A GPipe implementation in PyTorch

Language: Python · License: BSD-3-Clause · Stars: 796 · Issues: 0
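
A minimal sketch of the documented `GPipe` wrapper: balance layers across partitions and pipeline micro-batches ("chunks"); it assumes CUDA devices are available for the partitions.

```python
# A sketch of torchgpipe's documented GPipe wrapper; assumes CUDA
# devices are available for the partitions.
import torch
from torch import nn
from torchgpipe import GPipe

model = nn.Sequential(
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
)

# Two partitions of two layers each; every minibatch is split into
# 4 micro-batches that flow through the pipeline concurrently.
model = GPipe(model, balance=[2, 2], chunks=4)

x = torch.randn(16, 64).to(model.devices[0])  # input on the first partition
out = model(x)                                # output on the last partition
```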

LLMSys-PaperList

Large Language Model (LLM) Systems Paper List

Stars: 556 · Issues: 0

TurboTransformers

A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.

Language: C++ · License: NOASSERTION · Stars: 1464 · Issues: 0

FasterTransformer

Transformer related optimization, including BERT, GPT

Language: C++ · License: Apache-2.0 · Stars: 5747 · Issues: 0

pytorch-OpCounter

Count the MACs / FLOPs of your PyTorch model.

Language: Python · License: MIT · Stars: 4818 · Issues: 0
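
A usage sketch via the package's documented `thop.profile` entry point; the torchvision model is just an arbitrary example.

```python
# A sketch of pytorch-OpCounter usage via its documented `thop.profile`
# entry point, which returns estimated MACs and parameter counts.
import torch
from thop import profile
from torchvision.models import resnet18

model = resnet18()
dummy_input = torch.randn(1, 3, 224, 224)

macs, params = profile(model, inputs=(dummy_input,))
print(f"MACs: {macs / 1e9:.2f} G, Params: {params / 1e6:.2f} M")
```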

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda · License: Apache-2.0 · Stars: 1074 · Issues: 0

MeZO

[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333

Language: Python · License: MIT · Stars: 1013 · Issues: 0
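
A sketch of the zeroth-order (SPSA-style) update the paper describes, estimating the gradient from two forward passes with a shared random perturbation; this illustrates the algorithm rather than the repository's code, and `mezo_step` is a hypothetical helper.

```python
# Not the repository's code: a sketch of the zeroth-order update from
# the MeZO paper. `mezo_step` is a hypothetical helper: it estimates
# the gradient from two forward passes with a shared perturbation z
# and never calls backward().
import torch

def mezo_step(params, loss_fn, eps=1e-3, lr=5e-3, seed=0):
    # MeZO regenerates z from the seed instead of storing it, which is
    # what keeps memory at inference level; storing zs here is for brevity.
    torch.manual_seed(seed)
    zs = [torch.randn_like(p) for p in params]
    for p, z in zip(params, zs):
        p.add_(eps * z)                 # theta + eps*z
    loss_plus = loss_fn()
    for p, z in zip(params, zs):
        p.sub_(2 * eps * z)             # theta - eps*z
    loss_minus = loss_fn()
    grad_scale = (loss_plus - loss_minus) / (2 * eps)  # projected gradient
    for p, z in zip(params, zs):
        p.add_(eps * z)                 # restore theta
        p.sub_(lr * grad_scale * z)     # SGD step along z

# Toy usage: fit y = 2x with forward passes only.
w = torch.tensor([0.0])
x, y = torch.tensor([3.0]), torch.tensor([6.0])
for step in range(300):
    mezo_step([w], lambda: ((w * x - y) ** 2).item(), seed=step)
print(w)  # converges toward tensor([2.0])
```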