Kyriection

The University of Texas at Austin

Austin, TX, USA

@KyriectionZhang

Zhenyu (Allen) Zhang's starred repositories

stochastorch

A PyTorch implementation of stochastic addition.

Language: Python · License: Apache-2.0 · 500

LLM-Adapters

Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"

Language: Python · License: Apache-2.0 · 105000

FourierKAN

Language: Python · License: MIT · 69000

pykan

Kolmogorov Arnold Networks

Language: Jupyter Notebook · License: MIT · 1471300

infini-transformer-pytorch

Implementation of Infini-Transformer in PyTorch

Language: Python · License: MIT · 10000

LLM-FP4

The official implementation of the EMNLP 2023 paper LLM-FP4

Language: Python · License: MIT · 15900

PiPPy

Pipeline Parallelism for PyTorch

Language: Python · License: BSD-3-Clause · 71500

searchformer

Official codebase for the paper "Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping".

Language: Jupyter Notebook · License: NOASSERTION · 29200

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Language: Python · License: MIT · 123300

SnapKV

Language: Python · 17200

CUDA_gemm

A simple high performance CUDA GEMM implementation.

Language: Cuda · 32500

Usage-of-the-8bit-Quantization-in-Neural-Network-Training

This repo has the script to reproduce the experiments in project 'Usage of the 8bit Quantization in Neural Network Training'.

Language: Python · 600

attorch

A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.

Language: Python · License: MIT · 46100

llama3

The official Meta Llama 3 GitHub site

Language: Python · License: NOASSERTION · 2639400

qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Language: Jupyter Notebook · License: MIT · 995400

megalodon

Reference implementation of Megalodon 7B model

Language: Cuda · License: MIT · 50200

improved-t5

Experiments for efforts to train a new and improved t5

Language: Python · 7600

Q-Hitter

Language: Python · License: MIT · 700

CUDA-Learn-Notes

🎉 Modern CUDA Learn Notes with PyTorch: fp32, fp16, bf16, fp8/int8, flash_attn, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

Language: Cuda · License: GPL-3.0 · 119900

mistral-inference

Official inference library for Mistral models

Language: Jupyter Notebook · License: Apache-2.0 · 957300

SVD-LLM

Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"

Language: Python · License: Apache-2.0 · 8800

recurrentgemma

Open weights language model from Google DeepMind, based on Griffin.

Language: Python · License: Apache-2.0 · 59500

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda · License: MIT · 2360200

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python · License: MIT · 403100

pyreft

ReFT: Representation Finetuning for Language Models

Language: Python · License: Apache-2.0 · 111100

schedule_free

Schedule-Free Optimization in PyTorch

Language: Python · License: Apache-2.0 · 182800

GRIFFIN

Language: Python · 3000

JetMoE

Reaching LLaMA2 Performance with 0.1M Dollars

Language: Python · License: Apache-2.0 · 95800

SWE-agent

SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges.

Language: Python · License: MIT · 1338300

dbrx

Code examples and resources for DBRX, a large language model developed by Databricks

Language: Python · License: NOASSERTION · 249900