Zhenyu (Allen) Zhang (Kyriection)

Company: The University of Texas at Austin

Location: Austin, TX, USA

Home Page: zhenyu.gallery

Twitter: @KyriectionZhang

Zhenyu (Allen) Zhang's starred repositories

stochastorch

A Pytorch implementation of stochastic addition.

Language: Python, License: Apache-2.0, Stargazers: 5, Issues: 0

LLM-Adapters

Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"

Language: Python, License: Apache-2.0, Stargazers: 1050, Issues: 0
Language: Python, License: MIT, Stargazers: 690, Issues: 0

pykan

Kolmogorov Arnold Networks

Language: Jupyter Notebook, License: MIT, Stargazers: 14713, Issues: 0

infini-transformer-pytorch

Implementation of Infini-Transformer in Pytorch

Language: Python, License: MIT, Stargazers: 100, Issues: 0

LLM-FP4

The official implementation of the EMNLP 2023 paper LLM-FP4

Language: Python, License: MIT, Stargazers: 159, Issues: 0

PiPPy

Pipeline Parallelism for PyTorch

Language: Python, License: BSD-3-Clause, Stargazers: 715, Issues: 0

searchformer

Official codebase for the paper "Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping".

Language: Jupyter Notebook, License: NOASSERTION, Stargazers: 292, Issues: 0

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton

Language: Python, License: MIT, Stargazers: 1233, Issues: 0
Language: Python, Stargazers: 172, Issues: 0

CUDA_gemm

A simple high performance CUDA GEMM implementation.

Language: Cuda, Stargazers: 325, Issues: 0

Usage-of-the-8bit-Quantization-in-Neural-Network-Training

This repo has the script to reproduce the experiments in project 'Usage of the 8bit Quantization in Neural Network Training'.

Language: Python, Stargazers: 6, Issues: 0

attorch

A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.

Language: Python, License: MIT, Stargazers: 461, Issues: 0

llama3

The official Meta Llama 3 GitHub site

Language: Python, License: NOASSERTION, Stargazers: 26394, Issues: 0

qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Language: Jupyter Notebook, License: MIT, Stargazers: 9954, Issues: 0

megalodon

Reference implementation of Megalodon 7B model

Language: Cuda, License: MIT, Stargazers: 502, Issues: 0

improved-t5

Experiments for efforts to train a new and improved t5

Language: Python, Stargazers: 76, Issues: 0
Language: Python, License: MIT, Stargazers: 7, Issues: 0

CUDA-Learn-Notes

🎉 Modern CUDA Learn Notes with PyTorch: fp32, fp16, bf16, fp8/int8, flash_attn, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.

Language: Cuda, License: GPL-3.0, Stargazers: 1199, Issues: 0

mistral-inference

Official inference library for Mistral models

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 9573, Issues: 0

SVD-LLM

Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"

Language: Python, License: Apache-2.0, Stargazers: 88, Issues: 0

recurrentgemma

Open weights language model from Google DeepMind, based on Griffin.

Language: Python, License: Apache-2.0, Stargazers: 595, Issues: 0

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda, License: MIT, Stargazers: 23602, Issues: 0

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python, License: MIT, Stargazers: 4031, Issues: 0

pyreft

ReFT: Representation Finetuning for Language Models

Language: Python, License: Apache-2.0, Stargazers: 1111, Issues: 0

schedule_free

Schedule-Free Optimization in PyTorch

Language: Python, License: Apache-2.0, Stargazers: 1828, Issues: 0
Language: Python, Stargazers: 30, Issues: 0

JetMoE

Reaching LLaMA2 Performance with 0.1M Dollars

Language: Python, License: Apache-2.0, Stargazers: 958, Issues: 0

SWE-agent

SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges.

Language: Python, License: MIT, Stargazers: 13383, Issues: 0

dbrx

Code examples and resources for DBRX, a large language model developed by Databricks

Language: Python, License: NOASSERTION, Stargazers: 2499, Issues: 0