Zhenyu (Allen) Zhang (Kyriection)

Company: The University of Texas at Austin

Location: Austin, TX, USA

Home Page: zhenyu.gallery

Twitter: @KyriectionZhang

Zhenyu (Allen) Zhang's starred repositories

grok-1

Grok open release

Language: Python | License: Apache-2.0 | Stargazers: 49462 | Issues: 562 | Issues: 209

LLaMA-Factory

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

Language: Python | License: Apache-2.0 | Stargazers: 31670 | Issues: 201 | Issues: 4900

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda | License: MIT | Stargazers: 23601 | Issues: 230 | Issues: 136

SWE-agent

SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges.

Language: Python | License: MIT | Stargazers: 13383 | Issues: 97 | Issues: 374

mistral-src

Reference implementation of Mistral AI 7B v0.1 model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 8772 | Issues: 116 | Issues: 115
Language: Python | License: Apache-2.0 | Stargazers: 7096 | Issues: 66 | Issues: 71

FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python | License: MIT | Stargazers: 6989 | Issues: 44 | Issues: 998

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python | License: MIT | Stargazers: 4031 | Issues: 116 | Issues: 81

dbrx

Code examples and resources for DBRX, a large language model developed by Databricks

Language: Python | License: NOASSERTION | Stargazers: 2499 | Issues: 40 | Issues: 23

schedule_free

Schedule-Free Optimization in PyTorch

Language: Python | License: Apache-2.0 | Stargazers: 1828 | Issues: 15 | Issues: 30

Awesome-Graph-LLM

A collection of AWESOME things about Graph-Related LLMs.

TimeSformer

The official PyTorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"

Language: Python | License: NOASSERTION | Stargazers: 1519 | Issues: 27 | Issues: 129

GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Language: Python | License: Apache-2.0 | Stargazers: 1375 | Issues: 18 | Issues: 52

OpenMoE

A family of open-sourced Mixture-of-Experts (MoE) Large Language Models

pyreft

ReFT: Representation Finetuning for Language Models

Language: Python | License: Apache-2.0 | Stargazers: 1108 | Issues: 16 | Issues: 86

Triton-Puzzles

Puzzles for learning Triton

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1007 | Issues: 10 | Issues: 10

Efficient-LLMs-Survey

[TMLR 2024] Efficient Large Language Models: A Survey

JetMoE

Reaching LLaMA2 Performance with 0.1M Dollars

Language: Python | License: Apache-2.0 | Stargazers: 958 | Issues: 8 | Issues: 9

Awesome-LLM-Long-Context-Modeling

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

Neural-Network-Diffusion

We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters

recurrentgemma

Open weights language model from Google DeepMind, based on Griffin.

Language: Python | License: Apache-2.0 | Stargazers: 595 | Issues: 18 | Issues: 7

long-llms-learning

A repository collecting the literature on long-context large language models, including methodologies and evaluation benchmarks

Language: Jupyter Notebook | Stargazers: 241 | Issues: 8 | Issues: 2

LLMs-Finetuning-Safety

We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.

Language: Python | License: MIT | Stargazers: 222 | Issues: 4 | Issues: 6

scattermoe

Triton-based implementation of Sparse Mixture of Experts.

Language: Python | License: Apache-2.0 | Stargazers: 170 | Issues: 5 | Issues: 12

tinyBenchmarks

Evaluating LLMs with fewer examples

Language: Jupyter Notebook | License: MIT | Stargazers: 131 | Issues: 3 | Issues: 10

SVD-LLM

Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"

Language: Python | License: Apache-2.0 | Stargazers: 88 | Issues: 7 | Issues: 12
Language: Python | Stargazers: 30 | Issues: 2 | Issues: 0

resta

Restore safety in fine-tuned language models through task arithmetic

Ms-PoE

"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang.

Language: Python | License: MIT | Stargazers: 19 | Issues: 10 | Issues: 4