Yu Zhang (yzhangcs)

Company: Soochow University

Location: Shenzhen, Guangdong

Home Page: https://yzhang.site

Twitter: @yzhang_cs

Organizations
SUDA-LA

Yu Zhang's starred repositories

open_llama

OpenLLaMA, a permissively licensed open-source reproduction of Meta AI's LLaMA 7B trained on the RedPajama dataset

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 6848 | Issues: 62 | Issues: 175
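
Since the released checkpoints are LLaMA-compatible, they load with the standard Hugging Face classes. A minimal sketch, assuming the published hub ID openlm-research/open_llama_7b:

# Load OpenLLaMA through transformers; the hub ID is the released
# 7B checkpoint (adjust for the 3B/13B variants).
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("openlm-research/open_llama_7b")
model = LlamaForCausalLM.from_pretrained(
    "openlm-research/open_llama_7b", torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))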

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python | License: MIT | Stargazers: 5612 | Issues: 36 | Issues: 895
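
A rough sketch of a few-shot run, assuming the harness's v0.4-style Python entry point (lm_eval.simple_evaluate); older releases expose a main.py CLI instead:

# Evaluate a Hugging Face causal LM on two benchmark tasks.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                    # Hugging Face causal LM backend
    model_args="pretrained=gpt2",  # any hub model ID
    tasks=["hellaswag", "arc_easy"],
    num_fewshot=5,                 # in-context examples per prompt
)
print(results["results"])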

IP-Adapter

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 4365 | Issues: 61 | Issues: 338
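
A minimal sketch of using IP-Adapter through its diffusers integration; the weight name and the local file paths are assumptions for illustration:

# Steer a text-to-image pipeline with a reference image.
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models",
                     weight_name="ip-adapter_sd15.bin")

ref = load_image("reference.png")  # hypothetical local reference image
image = pipe(prompt="a dog in the park", ip_adapter_image=ref).images[0]
image.save("out.png")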

distil-whisper

Distilled variant of Whisper for speech recognition: 6x faster, 50% smaller, and within 1% word error rate of the original.

Language: Python | License: MIT | Stargazers: 3298 | Issues: 65 | Issues: 92
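
The distilled checkpoints drop into the standard transformers ASR pipeline; a minimal sketch, with the audio path as a placeholder:

# Transcribe an audio file with a Distil-Whisper checkpoint.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="distil-whisper/distil-large-v2",
    torch_dtype=torch.float16,
    device="cuda:0",
)
print(asr("speech.wav")["text"])  # hypothetical local audio file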

dm-haiku

JAX-based neural network library

Language: Python | License: Apache-2.0 | Stargazers: 2830 | Issues: 40 | Issues: 246
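
Haiku's core idea is that modules are defined inside a plain function, and hk.transform turns that function into a pure (init, apply) pair that composes with the rest of JAX. A minimal sketch:

# Define a network inside a function, then make it pure with hk.transform.
import haiku as hk
import jax
import jax.numpy as jnp

def forward(x):
    mlp = hk.Sequential([hk.Linear(64), jax.nn.relu, hk.Linear(10)])
    return mlp(x)

net = hk.transform(forward)
rng = jax.random.PRNGKey(0)
x = jnp.ones([8, 32])
params = net.init(rng, x)           # pure init: returns the parameter tree
logits = net.apply(params, rng, x)  # pure apply: no hidden module state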

consistencydecoder

Consistency Distilled Diff VAE

Language: Python | License: MIT | Stargazers: 2089 | Issues: 23 | Issues: 19

kernl

Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1480 | Issues: 27 | Issues: 174
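
A rough sketch of the advertised one-line optimization; the import path (kernl.model_optimization.optimize_model) is taken as an assumption from the project's README and should be checked against the installed version:

# Swap a transformer's forward pass for fused Triton/CUDA-graph kernels.
import torch
from transformers import AutoModel
from kernl.model_optimization import optimize_model

model = AutoModel.from_pretrained("bert-base-uncased").eval().cuda()
optimize_model(model)  # the "single line"

inputs = {
    "input_ids": torch.ones(1, 128, dtype=torch.long, device="cuda"),
    "attention_mask": torch.ones(1, 128, dtype=torch.long, device="cuda"),
}
with torch.inference_mode(), torch.cuda.amp.autocast():
    out = model(**inputs)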

cuda_programming

Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch

Language: Cuda | License: GPL-3.0 | Stargazers: 659 | Issues: 19 | Issues: 13

LongChat

Official repository for LongChat and LongEval

Language: Python | License: Apache-2.0 | Stargazers: 495 | Issues: 10 | Issues: 37

gpt_paper_assistant

GPT-4-based personalized arXiv paper assistant bot

Language: Python | License: Apache-2.0 | Stargazers: 432 | Issues: 6 | Issues: 10

ALCE

[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627

Language: Python | License: MIT | Stargazers: 405 | Issues: 8 | Issues: 19

AutoCompressors

[EMNLP 2023] Adapting Language Models to Compress Long Contexts

flash-fft-conv

FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores

Language: C++ | License: Apache-2.0 | Stargazers: 242 | Issues: 16 | Issues: 19
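
For reference, this is the operation the library accelerates: a long convolution computed through the FFT in O(L log L) rather than O(L^2). A plain PyTorch sketch (not the library's API):

# FFT-based causal convolution of a length-L signal with a length-L filter.
import torch

def fft_conv(u, k):
    # u: (batch, L) signal; k: (L,) filter
    L = u.shape[-1]
    n = 2 * L  # zero-pad so the circular convolution equals the linear one
    u_f = torch.fft.rfft(u, n=n)
    k_f = torch.fft.rfft(k, n=n)
    return torch.fft.irfft(u_f * k_f, n=n)[..., :L]  # keep the causal part

u = torch.randn(4, 1024)
k = torch.randn(1024)
print(fft_conv(u, k).shape)  # torch.Size([4, 1024])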

aft-pytorch

Unofficial PyTorch implementation of Attention Free Transformer (AFT) layers by Apple Inc.

Language: Python | License: MIT | Stargazers: 226 | Issues: 9 | Issues: 6
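
For a flavor of the mechanism, here is a sketch of AFT-simple, the paper's variant without learned position biases (not the repo's exact module): each query gates a single global context built from a softmax over the keys, so there are no pairwise attention scores.

# AFT-simple: Y_t = sigmoid(Q_t) * sum_t' softmax(K)_t' * V_t'.
import torch
import torch.nn as nn

class AFTSimple(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)

    def forward(self, x):  # x: (batch, seq, dim)
        q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)
        # One shared context vector: softmax over the sequence axis.
        ctx = (torch.softmax(k, dim=1) * v).sum(dim=1, keepdim=True)
        return torch.sigmoid(q) * ctx  # per-query elementwise gate

x = torch.randn(2, 16, 64)
print(AFTSimple(64)(x).shape)  # torch.Size([2, 16, 64])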

triton-transformer

Implementation of a Transformer, but completely in Triton

Language: Python | License: MIT | Stargazers: 225 | Issues: 15 | Issues: 8

CoLT5-attention

Implementation of the conditionally routed attention in the CoLT5 architecture, in PyTorch

Language: Python | License: MIT | Stargazers: 217 | Issues: 8 | Issues: 7

ModuleFormer

ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models (MoLM) ranging in scale from 4 billion to 8 billion parameters.

Language: Python | License: Apache-2.0 | Stargazers: 216 | Issues: 11 | Issues: 5

REST

REST: Retrieval-Based Speculative Decoding, NAACL 2024

Language: C | License: Apache-2.0 | Stargazers: 143 | Issues: 6 | Issues: 12

Mirror

🪞 A powerful toolkit for almost all information extraction tasks.

Language: Python | License: Apache-2.0 | Stargazers: 91 | Issues: 5 | Issues: 6

heinsen_sequence

Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023)
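
The sequential computation in question is the linear recurrence x_t = a_t * x_{t-1} + b_t, which unrolls to x_t = A_t * (x_0 + sum_{i<=t} b_i / A_i) with A_t = prod_{i<=t} a_i, so it can be evaluated in parallel with cumulative sums in log space. A sketch of that idea (assuming positive a_t and b_t, and x_0 = 0; not the repo's exact code):

# Parallel evaluation of x_t = a_t * x_{t-1} + b_t via logcumsumexp.
import torch

def parallel_linear_recurrence(log_a, log_b):
    a_star = torch.cumsum(log_a, dim=0)  # log A_t
    return torch.exp(a_star + torch.logcumsumexp(log_b - a_star, dim=0))

a = torch.rand(6) * 0.9 + 0.05
b = torch.rand(6) + 0.1
x = parallel_linear_recurrence(a.log(), b.log())

# Check against the naive sequential loop.
ref, cur = [], torch.tensor(0.0)
for t in range(6):
    cur = a[t] * cur + b[t]
    ref.append(cur)
print(torch.allclose(x, torch.stack(ref), atol=1e-4))  # True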

torchscale

Transformers at any scale

Language: Python | License: MIT | Stargazers: 40 | Issues: 0 | Issues: 0

SeqBoat

[NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling

Language: Assembly | License: MIT | Stargazers: 30 | Issues: 3 | Issues: 1

RAN

RAN: Recurrent Attention Networks for Long-text Modeling | Findings of ACL 2023

Language: Python | License: MIT | Stargazers: 20 | Issues: 2 | Issues: 1

HGERE

Source Code for "Joint Entity and Relation Extraction with Span Pruning and Hypergraph Neural Networks"

Language: Python | License: MIT | Stargazers: 20 | Issues: 2 | Issues: 4

fairseq-evo

Fairseq with transformer evolution

Language: Python | License: MIT | Stargazers: 4 | Issues: 1 | Issues: 0