davidchern's starred repositories

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | License: Apache-2.0 | Stargazers: 38520 | Issues: 383 | Issues: 1645

pykan

Kolmogorov Arnold Networks

Language: Jupyter Notebook | License: MIT | Stargazers: 14295 | Issues: 110 | Issues: 343

efficient-kan

An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).

Language: Python | License: MIT | Stargazers: 3774 | Issues: 33 | Issues: 36

MNBVC

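MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, and even "Martian script" (火星文) internet slang. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wikis, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.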

data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Language: Python | License: Apache-2.0 | Stargazers: 2383 | Issues: 17 | Issues: 163

drl-zh

Deep Reinforcement Learning: Zero to Hero!

Language: Jupyter Notebook | License: MIT | Stargazers: 1987 | Issues: 11 | Issues: 3

Convolutional-KANs

This project extends the Kolmogorov-Arnold Network (KAN) architecture to convolutional layers, replacing the classic linear transformation of the convolution with learnable nonlinear activations at each pixel.

Language: Jupyter Notebook | License: MIT | Stargazers: 683 | Issues: 13 | Issues: 11

LLMBox

A comprehensive library for implementing LLMs, including a unified training pipeline and comprehensive model evaluation.

Language: Python | License: MIT | Stargazers: 543 | Issues: 6 | Issues: 9

YuLan-Chat

YuLan: An Open-Source Large Language Model

Language: Python | License: MIT | Stargazers: 521 | Issues: 5 | Issues: 11

Score-Entropy-Discrete-Diffusion

[ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)

Language: Python | License: MIT | Stargazers: 333 | Issues: 6 | Issues: 10

ChebyKAN

Kolmogorov-Arnold Networks (KAN) using Chebyshev polynomials instead of B-splines.

Language: Jupyter Notebook | Stargazers: 324 | Issues: 8 | Issues: 8

MoRA

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Language: Python | License: Apache-2.0 | Stargazers: 319 | Issues: 3 | Issues: 11

fast-kan

FastKAN: Very Fast Implementation of Kolmogorov-Arnold Networks (KAN)

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 318 | Issues: 2 | Issues: 13

kanrl

Kolmogorov-Arnold Network for Reinforcement Learning, initial experiments

Language: Python | License: MIT | Stargazers: 158 | Issues: 6 | Issues: 1

Steel-LLM

Train a Chinese LLM from scratch as an individual

Language: Jupyter Notebook | Stargazers: 132 | Issues: 4 | Issues: 1

LKAN

Variations of Kolmogorov-Arnold Networks

Language: Python | License: MIT | Stargazers: 110 | Issues: 3 | Issues: 4

FCN-KAN

Kolmogorov–Arnold Networks with a modified activation (using a fully connected network to represent the activation)

Language: Python | License: MIT | Stargazers: 97 | Issues: 1 | Issues: 1

Token-level-Direct-Preference-Optimization

Reference implementation for Token-level Direct Preference Optimization (TDPO)

Language: Python | License: Apache-2.0 | Stargazers: 83 | Issues: 1 | Issues: 4

infini-mini-transformer

This is a personal reimplementation of Google's Infini-transformer, using a small 2B model. The project includes both model and training code.

Hrrformer

Hrrformer: A Neuro-symbolic Self-attention Model (ICML23)

nanoXLSTM

The simplest, fastest repository for training/finetuning medium-sized xLSTMs.

Language: Python | License: MIT | Stargazers: 38 | Issues: 1 | Issues: 0

MemoryMosaics

Memory Mosaics are networks of associative memories working in concert to achieve a prediction task.

Language: Python | License: Apache-2.0 | Stargazers: 30 | Issues: 5 | Issues: 3

Bloom-Lora

Fine-tune the BLOOM large language model with the LoRA method

ferns

Fast Exact Retrieval for Nearest-neighbor Search

Language: Jupyter Notebook | Stargazers: 11 | Issues: 0 | Issues: 0

NCHL

Neuron-centric Hebbian Learning

Language: Jupyter Notebook | Stargazers: 2 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0