Zhenyu He (zhenyuhe00)

Company: Peking University

Location: Beijing, China

Home Page: zhenyuhe00.github.io


Zhenyu He's starred repositories

ChunkLlama

[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"

Language: Python · License: Apache-2.0 · Stargazers: 293 · Issues: 0

awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

License: Apache-2.0 · Stargazers: 3068 · Issues: 0

Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

Stargazers: 882 · Issues: 0

MEGABYTE-pytorch

Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch

Language: Python · License: MIT · Stargazers: 600 · Issues: 0

ml4se

A curated list of papers, theses, datasets, and tools related to the application of Machine Learning for Software Engineering

Stargazers: 651 · Issues: 0

DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

License: MIT · Stargazers: 3027 · Issues: 0

infini-transformer-pytorch

Implementation of Infini-Transformer in Pytorch

Language: Python · License: MIT · Stargazers: 95 · Issues: 0

ring-attention-pytorch

Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch

Language: Python · License: MIT · Stargazers: 413 · Issues: 0

RULER

This repo contains the source code for RULER: What's the Real Context Size of Your Long-Context Language Models?

Language: Python · License: Apache-2.0 · Stargazers: 356 · Issues: 0

mixture-of-depths

An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"

Language: Python · License: NOASSERTION · Stargazers: 31 · Issues: 0

DVMP

The official implementation of dual-view molecule pre-training.

License: MIT · Stargazers: 3 · Issues: 0

llama3

The official Meta Llama 3 GitHub site

Language: Python · License: NOASSERTION · Stargazers: 23337 · Issues: 0

Transformer-M

[ICLR 2023] One Transformer Can Understand Both 2D & 3D Molecular Data (official implementation)

Language: Python · License: MIT · Stargazers: 197 · Issues: 0

VAR

[GPT beats diffusion 🔥] [scaling laws in visual generation 📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python · License: MIT · Stargazers: 3854 · Issues: 0

EasyKV

Easy control for Key-Value Constrained Generative LLM Inference (https://arxiv.org/abs/2402.06262)

Language: Python · Stargazers: 55 · Issues: 0

InfLLM

The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"

Language: Python · License: MIT · Stargazers: 240 · Issues: 0

TOVA

Token Omission Via Attention

Language: Python · License: Apache-2.0 · Stargazers: 113 · Issues: 0

H2O

[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

Language: Python · Stargazers: 322 · Issues: 0

fairseq2

FAIR Sequence Modeling Toolkit 2

Language: Python · License: MIT · Stargazers: 628 · Issues: 0

EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Language: Python · License: Apache-2.0 · Stargazers: 545 · Issues: 0

JetMoE

Reaching LLaMA2 Performance with 0.1M Dollars

Language: Python · License: Apache-2.0 · Stargazers: 947 · Issues: 0

BiPE

Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024

Language: Python · License: MIT · Stargazers: 19 · Issues: 0

mergekit

Tools for merging pretrained large language models.

Language: Python · License: LGPL-3.0 · Stargazers: 4118 · Issues: 0