cuiwenyao

myblog.cuimouren.cn

崔文耀's starred repositories

ollama

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

Language: Go; License: MIT; 93175 552 4578

grok-1

Grok open release

Language: Python; License: Apache-2.0; 49481 564 209

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python; License: Apache-2.0; 21856 185 490

DB-GPT

AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents

Language: Python; License: MIT; 13491 116 1083

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python; License: MIT; 11327 159 306

fastllm

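A pure C++ cross-platform LLM acceleration library, callable from Python. ChatGLM-6B-class models can reach 10000+ tokens/s on a single GPU. Supports GLM, LLaMA, and MOSS base models, and runs smoothly on mobile phones.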

Language: C++; License: Apache-2.0; 3296 41 364

DecryptPrompt

A summary of Prompt & LLM papers, open-source data and models, and AIGC applications

mamba-minimal

Simple, minimal implementation of the Mamba SSM in one file of PyTorch.

Language: Python; License: Apache-2.0; 2577 24 27

LLMTest_NeedleInAHaystack

Simple retrieval from LLMs at various context lengths to measure accuracy

Language: Jupyter Notebook; License: NOASSERTION; 1499 16 25

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Language: Python; License: MIT; 1261 27 47

Awesome-Efficient-LLM

A curated list for Efficient Large Language Models

Language: Python; 1179 42 3

mamba.py

A simple and efficient Mamba implementation in pure PyTorch and MLX.

Language: Python; License: MIT; 953 7 39

mamba-chat

Mamba-Chat: A chat LLM based on the state-space model architecture 🐍

Language: Python; License: Apache-2.0; 904 7 30

EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Language: Python; License: Apache-2.0; 624 8 46

LongLM

[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

Language: Python; License: MIT; 600 10 37

Awesome-state-space-models

Collection of papers on state-space models

streamlit-echarts

A Streamlit component to render ECharts.

Language: Python; License: MIT; 527 8 32

lost-in-the-middle

Code and data for "Lost in the Middle: How Language Models Use Long Contexts"

Language: Python; License: MIT; 305 5 14

mega

Sequence modeling with Mega.

Language: Python; License: MIT; 297 126 16

accelerated-scan

Accelerated First Order Parallel Associative Scan

Language: Python; License: MIT; 152 8 6

gated_linear_attention

Language: Python; License: MIT; 96 6 8

DiJiang

[ICML'24 Oral] The official code of "DiJiang: Efficient Large Language Models through Compact Kernelization", a novel DCT-based linear attention mechanism.

Language: Python; 94 5 7

LongICLBench

Code and Data for "Long-context LLMs Struggle with Long In-context Learning"

Language: Python; License: MIT; 88 3 4

DenseSSM

A repository for DenseSSMs

Language: Python; 88 3 3

hippogriff

Griffin MQA + Hawk Linear RNN Hybrid

Language: Python; License: MIT; 83 4 8

mamba-mini

An efficient PyTorch implementation of selective scan in one file that works on both CPU and GPU, with the corresponding mathematical derivation. It is probably the code closest to selective_scan_cuda in mamba.

Language: Python; 66 3 6

HGRN

[NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Sequence Modeling

Language: Python; 61 2 2

mamba-triton

Language: Python; 42 1 1

rnn-icrag

Official repository of the paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"

Language: Python; 24 20

resonance_rope

[ACL 24 Findings] Implementation of Resonance RoPE and the PosGen synthetic dataset.

Language: Python; License: Apache-2.0; 21 20