Yu Zhang (yzhangcs)

Company: Soochow University

Location: Shanghai

Home Page: https://yzhang.site

Twitter: @yzhang_cs


Organizations
SUDA-LA

Yu Zhang's starred repositories

datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Language: Python | License: Apache-2.0 | Stargazers: 19021 | Issues: 277 | Issues: 2885

Bend

A massively parallel, high-level programming language

Language: Rust | License: Apache-2.0 | Stargazers: 17230 | Issues: 93 | Issues: 246

llama3-from-scratch

llama3 implementation one matrix multiplication at a time

Language: Jupyter Notebook | License: MIT | Stargazers: 13117 | Issues: 93 | Issues: 16

matmulfreellm

Implementation for MatMul-free LM.

Language: Python | License: Apache-2.0 | Stargazers: 2879 | Issues: 43 | Issues: 29

Phi-3CookBook

This is a Phi-3 book for getting started with Phi-3. Phi-3 is a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and the next size up across a variety of language, reasoning, coding, and math benchmarks.

Language: Jupyter Notebook | License: MIT | Stargazers: 2258 | Issues: 16 | Issues: 62

DeepSeek-Coder-V2

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

xlstm

Official repository of the xLSTM.

Language: Python | License: AGPL-3.0 | Stargazers: 1243 | Issues: 13 | Issues: 43

streaming

A Data Streaming Library for Efficient Neural Network Training

Language: Python | License: Apache-2.0 | Stargazers: 1079 | Issues: 21 | Issues: 166

attention-cnn

Source code for "On the Relationship between Self-Attention and Convolutional Layers"

Language: Python | License: Apache-2.0 | Stargazers: 1077 | Issues: 27 | Issues: 10

gemma-2B-10M

Gemma 2B with 10M context length using Infini-attention.

StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Language: Python | License: MIT | Stargazers: 880 | Issues: 12 | Issues: 13

Samba

Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"

Language: Python | License: MIT | Stargazers: 780 | Issues: 24 | Issues: 16

magvit2-pytorch

Implementation of MagViT2 Tokenizer in Pytorch

Language: Python | License: MIT | Stargazers: 537 | Issues: 28 | Issues: 35

MS-AMP

Microsoft Automatic Mixed Precision Library

Language: Python | License: MIT | Stargazers: 508 | Issues: 11 | Issues: 63

Agent-Attention

Official repository of Agent Attention (ECCV2024)

Language: Python | License: Apache-2.0 | Stargazers: 422 | Issues: 11 | Issues: 11

InfLLM

The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"

Language: Python | License: MIT | Stargazers: 274 | Issues: 16 | Issues: 46

transformer-sequential

Trains Transformer model variants. Data isn't shuffled between batches.

Language: Python | License: NOASSERTION | Stargazers: 140 | Issues: 11 | Issues: 3

seqax

seqax = sequence modeling + JAX

Language: Python | License: BSD-3-Clause | Stargazers: 130 | Issues: 7 | Issues: 2

LCKV

Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance. Accepted to ACL 2024.

block-transformer

Block Transformer: Global-to-Local Language Modeling for Fast Inference (Official Code)

Language: Python | License: MIT | Stargazers: 118 | Issues: 5 | Issues: 4

triton-index

Cataloging released Triton kernels.

License: Apache-2.0 | Stargazers: 111 | Issues: 4 | Issues: 0

uncheatable_eval

Evaluating LLMs with Dynamic Data

Language: Jupyter Notebook | License: MIT | Stargazers: 66 | Issues: 2 | Issues: 2

MoE-SFT

🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts

Language: Python | License: Apache-2.0 | Stargazers: 33 | Issues: 1 | Issues: 0

hypernetwork-attention

Official code for the paper "Attention as a Hypernetwork"

Language: Python | License: MIT | Stargazers: 20 | Issues: 0 | Issues: 0

Language: Python | Stargazers: 9 | Issues: 0 | Issues: 0

GL-DancingMen

Cipher font of "The Adventure of the Dancing Men", The Return of Sherlock Holmes by Arthur Conan Doyle.

License: NOASSERTION | Stargazers: 3 | Issues: 1 | Issues: 0