Kyriection

Zhenyu (Allen) Zhang's starred repositories

grok-1

Grok open release

Language: Python | Apache-2.0 | 49462 562 209

LLaMA-Factory

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

Language: Python | Apache-2.0 | 31670 201 4900

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda | MIT | 23601 230 136

SWE-agent

SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges.

Language: Python | MIT | 13383 97 374

mistral-src

Reference implementation of Mistral AI 7B v0.1 model.

Language: Jupyter Notebook | Apache-2.0 | 8772 116 115

FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python | MIT | 6989 44 998

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python | MIT | 4031 116 81

transformer-debugger

Language: Python | MIT | 4016 25 14

dbrx

Code examples and resources for DBRX, a large language model developed by Databricks

Language: Python | NOASSERTION | 2499 40 23

schedule_free

Schedule-Free Optimization in PyTorch

Language: Python | Apache-2.0 | 1828 15 30

Awesome-Graph-LLM

A collection of AWESOME things about Graph-Related LLMs.

MIT | 1676 45 12

TimeSformer

The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"

Language: Python | NOASSERTION | 1519 27 129

GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Language: Python | Apache-2.0 | 1375 18 52

OpenMoE

A family of open-sourced Mixture-of-Experts (MoE) Large Language Models

Language: Python | 1366 14 8

pyreft

ReFT: Representation Finetuning for Language Models

Language: Python | Apache-2.0 | 1108 16 86

Triton-Puzzles

Puzzles for learning Triton

Language: Jupyter Notebook | Apache-2.0 | 1007 10 10

Efficient-LLMs-Survey

[TMLR 2024] Efficient Large Language Models: A Survey

Apache-2.0 | 968 24 11

JetMoE

Reaching LLaMA2 Performance with 0.1M Dollars

Language: Python | Apache-2.0 | 958 8 9

Awesome-LLM-Long-Context-Modeling

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

MIT | 861 36 5

Neural-Network-Diffusion

We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters

Language: Python | 770 18 18

recurrentgemma

Open weights language model from Google DeepMind, based on Griffin.

Language: Python | Apache-2.0 | 595 18 7

long-llms-learning

A repository sharing the literature on long-context large language models, including methodologies and evaluation benchmarks

Language: Jupyter Notebook | 241 8 2

LLMs-Finetuning-Safety

We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.

Language: Python | MIT | 222 4 6

scattermoe

Triton-based implementation of Sparse Mixture of Experts.

Language: Python | Apache-2.0 | 170 5 12

tinyBenchmarks

Evaluating LLMs with fewer examples

Language: Jupyter Notebook | MIT | 131 3 10

SVD-LLM

Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"

Language: Python | Apache-2.0 | 88 7 12

GRIFFIN

Language: Python | 30 20

resta

Restore safety in fine-tuned language models through task arithmetic

Language: Python | 25 2 1

Ms-PoE

"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang.

Language: Python | MIT | 19 10 4