yukang2017

CUHK

Hong Kong

Organizations

dvlab-research

yukang's starred repositories

lloco

The official repo for "LLoCo: Learning Long Contexts Offline"

Language: Python | License: MIT | 7200

ring-flash-attention

Ring attention implementation with flash attention

Language: Python | 31700

EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Language: Python | License: Apache-2.0 | 23100

MiniGemini

Official implementation for Mini-Gemini

Language: Python | License: Apache-2.0 | 253800

LSK3DNet

This is the official implementation of "LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels" (Accepted at CVPR 2024).

License: MIT | 1700

DeepCache

[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free

Language: Python | License: Apache-2.0 | 58400

grok-1

Grok open release

Language: Python | License: Apache-2.0 | 4765400

FollowYourClick

[arXiv 2024] Follow-Your-Click: This repo is the official implementation of "Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts"

CUHK-PHD-Thesis-Template

CUHK PhD Thesis Template

Language: TeX | 4700

notus

Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach

Language: Python | License: MIT | 14700

ChunkLlama

Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"

Language: Python | License: Apache-2.0 | 18900

SPIN

The official implementation of Self-Play Fine-Tuning (SPIN)

Language: Python | License: Apache-2.0 | 77000

Expert_Sparsity

Language: Python | License: MIT | 4700

LWM

Language: Python | License: Apache-2.0 | 678800

Long-Context-Data-Engineering

Implementation of paper Data Engineering for Scaling Language Models to 128K Context

Language: Python | 30200

OLMo

Modeling, training, eval, and inference code for OLMo

Language: Python | License: Apache-2.0 | 392200

AnyTool

Language: Python | License: Apache-2.0 | 15400

VIRL

Code for V-IRL: Grounding Virtual Intelligence in Real Life

Language: Python | 23800

DDSM

Denoising Diffusion Step-aware Models (ICLR2024)

Language: Python | License: MIT | 3800

code-act

Official Repo for paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji.

Language: Python | 17300

TravelPlanner

Dataset and code for the paper "TravelPlanner: A Benchmark for Real-World Planning with Language Agents"

Language: Python | License: MIT | 11000

LongAlign

LongAlign: A Recipe for Long Context Alignment Encompassing Data, Training, and Evaluation

Language: Python | License: Apache-2.0 | 9600

OpenMoE

A family of open-sourced Mixture-of-Experts (MoE) Large Language Models

Language: Python | 118900

AgentBoard

An Analytical Evaluation Board of Multi-turn LLM Agents

Language: SAS | 18000

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook | License: Apache-2.0 | 184400

parameter-efficient-moe

Language: Python | 21100

S-LoRA

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Language: Python | License: Apache-2.0 | 144500

hydra-moe

Language: Python | 39400

ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models

Language: Python | License: Apache-2.0 | 1078500

DeepSeek-MoE

Language: Python | License: MIT | 81400